Now, he probably doesn't need any introduction. He is a food hacker extraordinaire, from what I hear. He is also the director of Open Query, a MySQL support business. So please welcome Arjen Lentz.

It's a good crowd. I like speaking at LCA. The rooms are big and you get more people. That's excellent. Apparently I used up some credit getting this item into the program, because unless I talk about the actual code behind it, it wouldn't necessarily be an LCA talk. So I will indulge later on and explain how I did it, and yes, you're welcome to look at the code. It's out there, GPL of course.

So yes, my name is Arjen. I run a company called Open Query. It's based here in Brisbane, so welcome to my hometown if you're from elsewhere. Years and years ago I was working on a web project, and I was just mucking around in MySQL and PHP. I never actually finished that project. It never happened. I was browsing the MySQL website because of it, and that's how I got my job with MySQL, so some benefit came out of that project. Another sideline was that I hadn't solved that niggly problem yet, and the problem was that I was dealing with hierarchies in MySQL, menu trees to be exact. It just wasn't working the way I liked it and I was looking for a better way. A couple of years later I worked out how to do it conceptually, and there was no way to make that work at that point in MySQL. Later on the director of architecture, Brian Aker, told me it couldn't really be done the way I was thinking it could be done. I proved him wrong and he was very happy with that. He's good like that. We're good friends.

So anyway, it took a long time. It essentially took about six years to develop, and by the time I actually implemented it I had long left MySQL, the company that is. So what we have now is a graph engine. You may be aware that MySQL supports multiple engines.
There are different bits of code that can be used to store data. Some store information in memory only, some store things on disk, some store things on multiple disks. (Someone's run out of something now, with phones blipping.) There are also other differences in terms of locking, transactions, lots of different attributes of the architecture, and each of these engines is suitable for different purposes. For instance, the MyISAM engine was originally built for data warehousing, so it's not only good at lots of concurrent selects but also at very high-speed inserts, particularly bulk inserts. That is what it's good at, but primarily from a single thread. Multiple threads will tend to slow it down because of the write concurrency implementation in there. InnoDB is very good at transactional operations, and that will be what most production implementations use.

So the graph engine, or OQGraph for short, does something like that, except it doesn't store regular data. I will look at where it came from, what it does and how it does it, with some funky demos along the way.

The things we're trying to deal with are hierarchies. The usual things could be menu trees, but it could be some complex scientific data as well that happens to be a hierarchy, so think very broadly. This is just to give you an idea: typically menu structures, but also organizational trees. Simple organizations, not matrix organizations, of course. So Dilbert reports to the pointy-haired boss. He probably doesn't want to, but anyway. The other things we're dealing with are graphs. From my perspective they're more interesting, and a tree is essentially a simplified graph anyway.
So it's kind of the same thing, and that has become rather more important in recent years because of the social networking thing. It is fairly useful on a website, or for lots of other projects, if you can actually connect people and things to each other. Many websites at the moment are avoiding this because most of them run on a relational database, but even if you don't, you would need to implement something yourself or deal with it procedurally, and it either becomes very slow or a lot of work, let's put it that way. This might actually be a solution, and I have a demo of that later.

So the typical question that you might want to ask in a network or tree is: who reports to the pointy-haired boss? Then you get a list of employees, so who reports directly. Or how many people report directly or indirectly, and then you get a subtree. Or what is the path from this point to the root entry? That's a whole chain. So from me, how many steps between me and the CEO? In my company that's about zero, but anyway, you get the idea.

And of course you can play the six degrees of Kevin Bacon. Now I have to admit, I could probably run the six degrees of Kevin Bacon. I've got the filters on the movie database data set to actually make this work. However, I don't have enough RAM in my machine at home. 8 GB was not enough, because the current implementation wants to keep all that data in memory and I can't keep the entire graph in RAM the way it's currently implemented. So I need more than 8 GB. I got a fair way. I did import as much as I could and then did some queries, and in some cases it couldn't find a connection, and in some cases the path was maybe longer than it would otherwise have been. So yes, it can work. It's about, what was it, like 45 million
connections, if you strip out DVDs and stuff, if you stay purely with movies and strip out some other nonsense.

But also, what's the shortest path from A to B? So essentially you can re-implement, well, Facebook, LinkedIn and so on. I'm not for a second, by the way, suggesting that you should be trying to clone Facebook, because that's been done, both by Facebook as well as by others. But websites can benefit from social networking facilities, and there are plenty of ways to interact with the social network of people. You can access people's Twitter information and connect through there, for instance. You can use the Facebook API to get some information out of there. In any case, there are plenty of ways by which people would be able to connect to their friends on your website without having to re-enter their whole social graph. It is out there already. It is accessible.

Okay, so how do you deal with that in SQL? It doesn't fit particularly well. There's the adjacency model, which either has a fixed maximum depth, I mean you need to do joins, left joins or subqueries and you kind of code it that way, so you have a maximum depth that you're dealing with, or you need to deal with recursive queries, so you're essentially dealing with it procedurally. Not particularly nice. It can be fairly fast, but it's just not very pretty code. Oracle has CONNECT BY PRIOR, and that's just in Oracle. Various other servers have different things to do similar functionality. SQL:1999 has a recursive union. I despise the syntax that it uses, but the people who are used to it love it, and that's perfectly fine. Postgres supports the recursive union, so it can do some of these things out of the box, absolutely no problem at all.
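To make the adjacency model and the recursive union a bit more concrete, here is a minimal sketch. It uses SQLite through Python's `sqlite3` module purely because that runs without a server; it is not what the talk demonstrated. The `staff` table, its columns and its rows are hypothetical illustration data.

```python
import sqlite3

# Adjacency model: every row points at its parent (here: its boss).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE staff (id INTEGER PRIMARY KEY, name TEXT, boss_id INTEGER)"
)
conn.executemany(
    "INSERT INTO staff VALUES (?, ?, ?)",
    [
        (1, "CEO", None),
        (2, "Pointy-haired boss", 1),
        (3, "Dilbert", 2),
        (4, "Wally", 2),
        (5, "Intern", 3),
    ],
)


def reports_of(boss_id):
    """Return (name, depth) for everyone reporting directly or indirectly."""
    return conn.execute(
        """
        WITH RECURSIVE sub(id, name, depth) AS (
            -- anchor: direct reports
            SELECT id, name, 1 FROM staff WHERE boss_id = ?
            UNION ALL
            -- recursive step: reports of the rows found so far
            SELECT s.id, s.name, sub.depth + 1
            FROM staff AS s JOIN sub ON s.boss_id = sub.id
        )
        SELECT name, depth FROM sub ORDER BY depth, id
        """,
        (boss_id,),
    ).fetchall()


subtree = reports_of(2)
print(subtree)
```

The fixed-depth alternative mentioned above would replace the recursive part with one self-join per level, which is why it only works up to a depth chosen in advance.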
Okay. Then there are nested sets. Their architecture is a little bit more complex, but you can do a lot more nice tricks. You can very easily find an entire subtree. That is one of those very, very easy queries with nested sets. I won't go into the details of how that's implemented, but basically inserting entries takes a little bit more work. It might require some shuffling of IDs and that sort of thing. The other thing is it cannot deal with just any graph. It has to be a plain tree, so if any entry has two parents, it doesn't work.

You can use a materialized path, and the simple description of that is 1/3/4/10: essentially you write out the path and use it as a string. That is actually remarkably functional. It seems a bit stupid, but it works. One implementation that you may have seen in the wild is in eZ Publish, a Norwegian enterprise CMS. It uses that and it actually works remarkably well, but again it can't really deal with multiple parents without other hacks.

Graphs you just have to handle programmatically, as in you look at an entry and you see which other parents it has, or which other links it has, and you just walk through it, backtrack and so on. You could put that in a stored procedure. That works as well, but it doesn't necessarily make it faster or nicer to handle.

So what is OQGraph then? It's something that, as I mentioned, was developed by me in concept. I hacked around in the middle of the night and at various other bits of time to get the basics going, and then I needed someone smarter to actually plug it properly into the MySQL server. So the person sitting next to me there is Antony Curtis. He's been hacking on storage engines for over ten years. He's probably written more than any of us, and he's very good at that kind of thing. So it is a storage engine. It doesn't modify anything in the server itself, so we haven't mucked around with the optimizer.
We just tell it a few lies to make it do what we want, but that's what all engines do essentially. You just tell it what you want it to do. We haven't changed the SQL syntax or added to it. That's not necessary, and that was very important to us, because initially we had no idea whether this was going to get integrated into MySQL, and if so how. The less intrusive you are, the more chance you have, of course. If it's a clean plug-in, there's a higher chance, because you're not interfering with the other versions or all the other stuff going on.

Okay, so it's technically an engine. However, it's not actually storing regular data. The only thing we're storing is essentially node IDs, or the ID of a person, and the link between them. So we have an originating node, a destination node and an optional weight. That's all. The rest of a person's data, or whatever data you attach it to, would be in other tables. But heck, that's what a relational database is for: you have other tables and you can join on them.

The name that Antony Curtis came up with is that it's a computational engine. We stick bits of information in and it computes whatever we want, in this case for instance a shortest path or a related type of query. Okay, does that make sense so far?

So it looks like a table from the user's perspective, and technically it is relational. We query it and out comes a table. We juggle around with sets and return a set as well, so that makes it relational. But what you get returned is not a subset of the rows that you put in. It is based on the rows you put in, but it will look a little bit different.
We'll get to that in a moment. Okay, so there's a data set that you put in, and you can get it back in a clean way. You can retrieve the data set for a dump. But once you start making that engine do tasks, your result set does not bear much resemblance to the actual table inside, because inside, of course, it's a graph and not a table. So of the number of rows that you put in, you don't get back a subset. You may get back a funny number. The table that you're working with actually has more columns than you are able to put data into. There are some that have a special function. We'll look at that in a moment. And the indexes that the table structure has are a lie. They don't actually exist. We just need to tell the optimizer that they exist, otherwise it makes the wrong choices, and you'll see why in a moment. So you can kind of see it as a magic view. We just need to make sure that the environment does the right thing for us.

How would you install it? For MySQL 5.0 it's a bit of a pest, because we do need to hack things into the server. For MySQL version 5.0 you can't just add a new storage engine. You need to change the parser for it, so you need to plug into the parser, add some lex keywords and then put it into bison, in the grammar of the MySQL language, and so on. So it needs to be put into various bits. It is a fairly simple patch, but depending on which version of MySQL you're dealing with, you need to modify the patch. So what we've done is put it in the OurDelta enhanced builds. That's where it's available. If you're still using MySQL 5.0, this would be one place to get it. You just replace your binary with this binary or package. There are Debian, Ubuntu, Red Hat and CentOS packages, and that just works. You can see whether you have that particular engine by doing SHOW GLOBAL VARIABLES LIKE 'have_oqgraph', and that will be set to true.
That's one of those little patches that we put in. That's the normal way of doing it in version 5.0, so that's the quick hack. The current production version of MySQL, depending on what you're using, is 5.1. It could also be 5.5 from Oracle now, and if you're using MariaDB it would be 5.2. Now the really cool thing, at least from my perspective, is that MariaDB 5.2 has OQGraph built in, and that's just really, really nice.

The problem with plugins from 5.1 and above is, yes, they're plugins. You could build whatever you're plugging in separately and then just say INSTALL PLUGIN and so on. However, you need to get a number of compiler switches exactly right so that it matches up with what the main build was, and to make it even more fun, there's no versioning of the API. So if you don't compile it against the exact same version of the source code that MySQL used for that particular build (and yes, you can find that out), things may blow up or they may not. Anyway, some things will happen. I know from other people writing engines that this is a complete pest, so I would really, really discourage this, and that's why I'm really, really happy that it's just built into MariaDB.

It's not compiled into the binary. What happens is that it gets built in the same build environment and ends up as a shared library in the tree and gets installed, and then you still need to run that install line to make sure the library gets loaded by the server. So if you just install MariaDB, OQGraph will not be loaded, will not take up any memory or do anything, and then when you load it, it is available and you can use it to deal with graph tables. You can see whether it's installed by doing SHOW PLUGINS, and it will be listed in there, and SHOW STORAGE ENGINES, because it is a storage engine plugin.

For Drizzle, a port has been done. We haven't actually worked on it because they have been
modifying the API so much. I want to wait until it settles a bit, and then hopefully someone else will pay me for cleaning it up again. I just don't have the time for it right now. But the original port was done by someone else without telling us beforehand, so quite likely someone else will do this again and it will just happen. They're probably waiting for it to settle, so as not to have to do it every weekend.

What does the table look like? It always looks like this. It may appear a bit funny, but like I said, we're not dealing with a regular table. We're not dealing with regular data. We're dealing purely with the graph data. All the other data would be in a separate table, and in fact that separate table will probably also have that link information, but you would copy that across to this graph table, maybe via triggers, maybe periodically. It depends a bit on how often that information gets updated.

I think the next page explains kind of what it does, but just briefly, to go back: you have latch, which we'll get back to in a moment. origid, destid and weight are the ones you actually put data in. seq and linkid are output columns that we'll deal with in a moment. Those two indexes make sure that MySQL will never attempt a table scan to resolve a query. If it were to try doing a table scan, it would get back your original table information rather than what we're trying to compute. It needs to actually address the different functions in the API that do primary key lookups, direct lookups, and for that you need to pretend to have a hash index. That's what we have, or at least pretend to have. It's quirky, isn't it? That's what you get when you talk to Antony Curtis. He can do things with the optimizer without telling the optimizer. It's really, really good.

So you have origid and destid. That's a link between the two.
It's directional. You're always creating a directed graph at the moment. If you want a two-directional link, you just insert the same link in the opposite direction as well, so you're doubling the number of links that you have if everything is bidirectional. There's an optional weight. The default is one. Maybe the default should be zero. I'm happy to debate that one. At the moment it's one. None of the other columns actually exist. We're not storing them. We're just using them for input and output, and we'll get back to that.

So to insert something, this is an example. We're inserting into foo, our typical demo table, origid and destid. We don't care about the other columns because they don't exist anyway, and the default weight will be one. (1,2), (2,3) and so on, so we create a tiny little tree of, well, items, people, whatever their IDs might be.

We can select that back. By the way, I could do a live demo, but it does not serve a particular purpose. I've just cut and pasted this earlier. So when you select, that's the information you get back. Now, you already see there is some extra information in there. seq starts counting for you. It's a sequence, like an ordering. Normally in relational structures rows have no order, so unless you use an ORDER BY you have no control over the order in which the rows come back. If you don't use ORDER BY they may come back in any order. Now, if you use ORDER BY then you have to order by something that is already in that row, otherwise you can't decide. Usually it's someone's name or an ID or something. You'll see later in the results that we really need to know in which order entries come, because those are the steps from A to B. Therefore we stick in the sequence. Then you can ORDER BY seq and all will be well. So we need to provide our own ordering so that you can use it.
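As a rough mental model of what gets stored, the sketch below keeps edges as (origid, destid, weight) records with a default weight of one, and makes a link two-directional by storing its mirror image as well. This is plain Python for illustration, not the engine's code; the `Edge` type and `insert_edges` helper are hypothetical names.

```python
from typing import NamedTuple


class Edge(NamedTuple):
    """One stored link: only these three values are kept per edge."""
    origid: int
    destid: int
    weight: float = 1.0  # default weight when none is given


def insert_edges(table, pairs, both_ways=False):
    """Append one directed edge per (orig, dest) pair.

    With both_ways=True the reverse edge is stored too, which doubles
    the number of rows, as described for bidirectional links.
    """
    for orig, dest in pairs:
        table.append(Edge(orig, dest))
        if both_ways:
            table.append(Edge(dest, orig))
    return table


one_way = insert_edges([], [(1, 2), (2, 3)])
two_way = insert_edges([], [(1, 2), (2, 3)], both_ways=True)
print(len(one_way), len(two_way))
```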
Okay. latch is NULL because we're not doing anything with it there, and linkid is NULL as well, and the weight is one, as we, well, we didn't specify it and it ended up as the default.

Now we're going to do magic. Right, still with me? We are setting latch to one, and then we're saying I want to find the shortest path. latch = 1 says that. I'll explain that in a moment. origid = 1, destid = 6, and you see in the output latch equals one, origid equals one and destid equals six, so we're relationally still kind of correct. Based on the query you get the output. The fact that the table itself, in the storage, didn't actually have a latch = 1 stored anywhere, and that we didn't filter it that way to get it back, is a sideline. Applications don't care about that. MySQL doesn't care about that at that higher level, nor does the optimizer. So we can kind of fudge around with that.

Then you see the steps. Let's see if I have a cursor there. That's good, I don't need a laser pointer. So those are the steps. Step zero is the originating step, then step one, step two, step three. Okay, now, which are the actual steps here? We start at item 1, we walk to item 2, item 3 and item 6. That is our path.

We can clean it up a bit. There's a nice function called GROUP_CONCAT, in MySQL version 4.1 and up, and we can use that to essentially create a little bit of a pivot. So we can turn the column linkid into a single value, concatenated with whatever string we want as a separator. It defaults to a comma. And we can apply an ORDER BY. We order by seq of course, otherwise our path will go funny. So we get a path of 1, 2, 3, 6. You could also insert arrows. We'll do that later just for fun. But this shows you a quick path.
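The shape of that result, one row per step carrying a sequence number and a node id, then flattened into a single string ordered by the sequence, can be modelled in a few lines of Python. This is a sketch of the idea only, not the engine's implementation. The edge list is an assumption chosen to be consistent with the path 1, 2, 3, 6 described above (the slide itself is not in the transcript), and `shortest_path_rows` is a hypothetical helper name.

```python
import heapq

# Assumed example edges (origid, destid); every weight defaults to 1.
EDGES = [(1, 2), (2, 3), (2, 4), (4, 5), (3, 6), (5, 6)]


def shortest_path_rows(edges, origid, destid):
    """Dijkstra over directed edges; returns [(seq, linkid), ...] rows."""
    graph = {}
    for orig, dest in edges:
        graph.setdefault(orig, []).append((dest, 1.0))
    dist = {origid: 0.0}
    prev = {}
    heap = [(0.0, origid)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == destid:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for nxt, weight in graph.get(node, []):
            nd = d + weight
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                prev[nxt] = node
                heapq.heappush(heap, (nd, nxt))
    if destid not in dist:
        return []  # no route: empty result set
    path = [destid]
    while path[-1] != origid:
        path.append(prev[path[-1]])
    path.reverse()
    # seq 0 is the originating node, then one row per step
    return list(enumerate(path))


rows = shortest_path_rows(EDGES, 1, 6)
# Same idea as concatenating linkid ordered by seq, with a comma separator
path = ",".join(str(linkid) for _, linkid in sorted(rows))
print(rows, path)
```

Sorting by the sequence number before joining is the point of the exercise: without it, nothing guarantees that the steps come back in walking order.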
The reason I named it latch is, who here also does electronics? Some of you will be familiar with latches. You control microchips with them, and you can tell the chip what to do with certain registers based on what value is in there. That is precisely what we're doing. So latch = 1 tells the engine to do computations based on Dijkstra's shortest path algorithm. That's it. If latch is NULL then it will deliver back the original table. That's the idea. So we can add in an infinite number of values, well, limited by the width of that field of course. We're probably going to make that an enum, so you could address it via the number or a name. That's probably nicer. The problem is people are already using this, so we need to be a little bit careful how we hack it, but we can turn it into an enum and you can still use the numbers anyway.

Other searches you could do: if you don't specify an originating ID, it finds all paths to the destination ID. So from where could I get to node 4? Not from everywhere, because they're one-way paths. You can't get from item 6 to item 4. That doesn't work. But you can get from item 1 to item 4. So from 1, 2 and 4 you can arrive at 4. Where can I get to from item 4? Well, you can get to 4, 5 and 6. I didn't return them in any particular order, so it comes out as 6, 5, 4 in this particular case.

By the way, if you'd like a copy of these slides, that's perfectly fine too. I see some people making notes. That's cool, but the slides are public.

So this is the basic mathematical background of it and how fast it would be. It makes more sense for some people than for others.
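The two open-ended searches just described (which nodes can reach node 4, and which nodes can be reached from node 4) amount to a graph walk over the edges, forwards or backwards. A minimal Python sketch follows. The edge list is again an assumption picked to match the answers quoted above, and `reachable` is a hypothetical helper, not part of the engine.

```python
from collections import deque

# Assumed example edges (origid, destid), all one-way.
EDGES = [(1, 2), (2, 3), (2, 4), (4, 5), (3, 6), (5, 6)]


def reachable(edges, start, reverse=False):
    """Set of nodes reachable from start (start itself included).

    With reverse=True the edges are followed backwards, which answers
    "from where can I get to start?" instead.
    """
    graph = {}
    for orig, dest in edges:
        if reverse:
            orig, dest = dest, orig
        graph.setdefault(orig, []).append(dest)
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


can_reach_4 = reachable(EDGES, 4, reverse=True)  # destination given, no origin
from_4 = reachable(EDGES, 4)                     # origin given, no destination
print(sorted(can_reach_4), sorted(from_4))
```

A set is returned on purpose: as in the talk, the order of the result carries no meaning.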
I tend to not care about those things. Pragmatist here.

SQL UPDATE and DELETE are also possible. However, don't modify the path you're traversing, because Dijkstra of course needs to backtrack once in a while, and you could find yourself lost in a wilderness. The engine has no way of knowing or caring, and the amount of code we would need to spend on making it care is not worthwhile. So yeah, you don't change the path while you're walking it. You could retrieve it, stick it in a temporary table and then modify it in a separate query in some way, but know what you're getting yourself into. Modifying your table while you're walking it: not safe. In that sense it's of course different from a regular table. If you modify a regular table, you only look at each row once. In this case you're not dealing with a regular table, but with magic.

Now, how to make this information pretty? How do we actually tack it on to real-world information that you want to output, rather than the numbers? So we insert some stuff from M*A*S*H into another table and then we do a join. Does this make sense for everybody? Yeah, we just join it onto a second table. So we have a path from Pierce to Mulcahy.

Or you can use some RDF data. Why not? This was one of the simplest yet fun, reasonably sized data sets. It's somewhere between 89,000 and 92,000 entries. It's a tree. We want to walk it in any direction, so I inserted it twice, with reverse links, so you end up with about 178,000 edges. The information for this is all available. tolweb.org has the basic data set. What we have is a little XSL to mangle that into something that we can use, and then we insert that. So all the transformation stuff is available. We create SQL with XSL, which we import.

So what can we do now? Where are Homo sapiens? Where do we reside in the tree of life?
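The two preparation steps described above, storing every tree link in both directions and joining the node ids of a path onto a table of names, can be sketched in Python as follows. The tiny tree and its names are hypothetical stand-ins, not the real data set, and `bfs_path` is an illustrative helper rather than the engine's algorithm.

```python
from collections import deque

# Hypothetical miniature tree: (parent, child) links and a names "table".
PARENT_LINKS = [(1, 2), (1, 3), (2, 4), (2, 5), (3, 6)]
NAMES = {1: "root", 2: "animals", 3: "plants",
         4: "humans", 5: "cats", 6: "bananas"}

# Insert every link twice, once reversed, so the tree can be walked both
# up and down. This doubles the edge count.
edges = PARENT_LINKS + [(child, parent) for parent, child in PARENT_LINKS]


def bfs_path(edge_list, start, goal):
    """Shortest path by hop count; returns the list of node ids."""
    graph = {}
    for orig, dest in edge_list:
        graph.setdefault(orig, []).append(dest)
    prev = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            break
        for nxt in graph.get(node, []):
            if nxt not in prev:
                prev[nxt] = node
                queue.append(nxt)
    if goal not in prev:
        return []
    path = []
    node = goal
    while node is not None:
        path.append(node)
        node = prev[node]
    return path[::-1]


ids = bfs_path(edges, 4, 6)                      # up to the root, down again
pretty = " -> ".join(NAMES[node] for node in ids)  # the "join" onto names
print(len(edges), ids, pretty)
```

With only the parent-to-child links stored, the same query would find no route at all, because it has to climb before it can descend.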
So the Tree of Life Web Project essentially is one of the projects on the web for looking at all life on earth and seeing how it connects. Not all branches, or let's say intersections, in the tree are named. I'll mention that now. So this path is actually 76 steps, but not all the steps are named, and in the import process I made those fields NULL. Because we're using a GROUP BY function, that's GROUP_CONCAT, it'll leave out the NULLs. Very convenient magic.

Okay, so we go from life on earth into the animals and so on, vertebrates, Mammalia, hominids, and then you end up with Homo sapiens. It does work out. So it's a path from the root entry down the tree. You could do this via other programmatic means or other structures. This is all still doable. After all, it's a tree. But we're going to do more funky stuff that takes a little bit more effort.

You really, really want to know now how we relate to bananas. By the way, all of this only applies if the earth is more than 6,000 years old. Otherwise all bets are off on this one. Okay, how do we get from Homo sapiens to that family of banana trees? I used Wikipedia for this, by the way. You go all the way up to the eukaryotes, and then you go down the other path and end up in the tree at the bananas. So that kind of works: we've gone up the tree and down the tree, kind of pretending it's a graph. That seems to work. Again, there are more steps than this, but it just comes back in, I don't know, a couple of hundredths of a second on my laptop. There's no particular sense in measuring that. It's just a very fast query.

Now we've dealt with the bananas, let's do something else. The thing to do in programs, of course, is to build a maze and then traverse it, and we can now do this in SQL. Does this have any use whatsoever?
Of course it doesn't. This is pure nonsense, but it works, and yes, the code is available. You can build your own mazes of millions by millions of things and try to solve them in SQL in your database. It's fun nonsense. So a very simple example: this is a five-by-five, and it finds a path from there, to walk around and so on. So I called this Dijkstra's mouse.

This is too simple, so we do a thousand. Hang on. Yeah, I've done a thousand by a thousand. It didn't want to load before. I stuffed something up in my loader routine on this machine, so I'll just have to tell you. I made a thousand-by-a-thousand maze and dumped it into an SQL file so it can then load that. What does that do? That creates a million rooms, a thousand by a thousand equals a million, which means you have two point something million doors between the rooms, which means that you have double that number in actual paths, because you can walk through a door in both directions. When you actually traverse the path you have something in the order of a hundred-something thousand steps to get from A to B, usually. It depends a bit, because of course it creates a random path when you generate the maze. It still comes back in under a second, 0.7 or 0.8 of a second on this 2 GHz dual core, and for a single query MySQL only uses a single core. So it can do that kind of query, to actually take those hundred thousand steps, or report those hundred thousand steps back to you.
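For anyone who wants to play with the maze idea outside a database, here is a small self-contained Python sketch: it generates a random maze, stores each door as a link between two room numbers, walks every door in both directions and finds the route from one corner to the other. Several things are assumptions of this sketch rather than facts from the talk: the room numbering (`y * width + x`), the generator (a randomized depth-first search that yields a spanning-tree maze, so the door count will differ from the figures quoted above) and the function names.

```python
import random
from collections import deque


def build_maze(width, height, seed=0):
    """Carve a maze with a randomized depth-first search.

    Rooms are numbered y * width + x (an assumed scheme). Returns the
    list of doors as (room_a, room_b) pairs.
    """
    rng = random.Random(seed)
    visited = {0}
    stack = [0]
    doors = []
    while stack:
        room = stack[-1]
        x, y = room % width, room // width
        neighbours = []
        if x > 0:
            neighbours.append(room - 1)
        if x < width - 1:
            neighbours.append(room + 1)
        if y > 0:
            neighbours.append(room - width)
        if y < height - 1:
            neighbours.append(room + width)
        unvisited = [n for n in neighbours if n not in visited]
        if not unvisited:
            stack.pop()  # dead end: backtrack
            continue
        nxt = rng.choice(unvisited)
        visited.add(nxt)
        doors.append((room, nxt))
        stack.append(nxt)
    return doors


def solve(doors, start, goal):
    """Breadth-first search; every door can be walked in both directions."""
    graph = {}
    for a, b in doors:
        graph.setdefault(a, []).append(b)
        graph.setdefault(b, []).append(a)
    prev = {start: None}
    queue = deque([start])
    while queue:
        room = queue.popleft()
        if room == goal:
            break
        for nxt in graph.get(room, []):
            if nxt not in prev:
                prev[nxt] = room
                queue.append(nxt)
    path = []
    room = goal
    while room is not None:
        path.append(room)
        room = prev[room]
    return path[::-1]


WIDTH, HEIGHT = 20, 20
doors = build_maze(WIDTH, HEIGHT)
path = solve(doors, 0, WIDTH * HEIGHT - 1)
# Turn room numbers back into (x, y) coordinates for display
coords = [(room % WIDTH, room // WIDTH) for room in path]
print(len(doors), len(path), coords[0], coords[-1])
```

Because this generator connects every room exactly once, a route between any two rooms always exists, so the solver never has to handle the no-path case.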
Of course, it's looked at many more rooms. I'd have to add some debugging to figure out how many it has, and again it's dependent on the maze, but it's looked at hundreds of thousands of rooms, maybe even most of the rooms in the million, and then worked out which hundred thousand are on the path, and it still comes back in less than a second. So again, it's absolutely useless and senseless to play with mazes in SQL. It makes no sense whatsoever. But it proves the point that you can have a fairly large data set and look at it fairly efficiently and get a decent result that you can then use in your application, and that was the point.

You can use a query like this to actually make it look good in the output. On input I linearized the x and y coordinates of the rooms, and then on output I rip them apart again. So a bit of basic math.

How much memory does it use, and how does the engine behave? It uses about 60 bytes per edge. We use the Boost Graph Library internally, so we haven't actually re-implemented all those graph algorithms. We just use what the Boost Graph Library delivers. Now, having said that, we're probably going to rip that out again, because we're thoroughly fed up with C++ templates. They're not really shiny. I despise C++ already. Not good. I can't even add new algorithms now. I get completely lost. I need Antony's, not just help, I need hand-holding. Virtual hand-holding.
He's in Los Angeles, but still. It does not make me happy. If I'm having trouble doing this and I understand how the thing works, other people are not going to add algorithms, and that's a very serious problem, because at the moment we're not getting contributions to this even though people are interested. The infrastructure is just too complicated. So if any of you are C++ and template capable, please help me while we are still in template land, because ripping out the entire back end of the engine and using another library (we have found another little C library that does this stuff), well, that hasn't happened yet, so we do need to deal with those issues.

Could someone grab the power supply from the bottom of my pack, please? Yeah, somewhere in the bottom, rolled up. Just in case. I think I should have enough power, but you never know. So there's a plug there, and yeah, okay.

So it behaves like a memory engine. It does not use the MEMORY engine. I want to make that absolutely clear, since someone got confused about that. (Yep, that seems to work. Thank you.) It uses table-level locking, which means that whatever operation someone does, if it's a write, everybody else will have to wait. If it's a read, other people can read the table or run a search, but writers will have to wait. So that's the basic locking strategy at this point. You have to remember it's all in memory, so it's really, really fast anyway. It doesn't matter that you have that kind of locking in this particular case.

There's no persistence. That is, if you restart your server the table will still exist, but it will be empty, because it is all in memory. As I mentioned before, you probably have the structure of the links in another table and you copy that in. You can copy that in on startup, and there are ways to do that. There's a configuration option called init-file, and if you specify a file with that, it will be read on startup and the server will execute SQL queries from there.
So you can actually say INSERT INTO a certain table, SELECT a couple of columns from another table. Yes, Lindsey? It just works. Or, I think you're saying, if you restart one server and it happens to be the master. Okay, so the question is how does it work with replication. It would also replicate, and yes, it would cause trouble if you don't need to initialize a slave. If you install this on all machines in the environment, including the slaves, so they all initialize themselves, then there's a statement for a thread where you can actually disable binary logging: SET SESSION sql_log_bin = 0. Then run this stuff and then turn it on again. That would solve that problem. That is, I think, the quickest way at this point to deal with it.

Of course we would like a persistent version, but that would just take more development time. We want more people using it. There are some people using it, we'd like more people using it, and then persistence is only actually required for bigger data sets, because reloading this kind of stuff, up to a couple of million entries, is not really a problem. And again, you duplicate the data anyway. You can also use triggers for inserts, updates and deletes, so that when you change your original data you update this table too. That will not make it persistent, but you can combine it with the reload of your data to keep it up to date, so that can work.

Okay, it doesn't have transactions either. So if you combine the graph engine in a transaction with something else, an insert will not be consistent between the two engines. They can't synchronize like that.

Okay, those are the links for the source code and so on. What I'll do now is give you a practical example of how to actually use this stuff. Welcome to Friendface. Yay. No, I don't watch The IT Crowd, but there are people around me who do. This is Drupal. Is this readable? Is it a bit small? I might make it a bit bigger.
There we go. Does that work? Yeah, it walks off the end of the screen. I might make it smaller again. Sorry, you'll have to squint a little bit.
So, there's a friendlist module for Drupal, so you can just install that. What we added is a friendlist graph module, which was implemented early last year by my good friend and colleague Peter Lieverdink, cafuego on IRC, and he implemented that in about, I don't know, an hour total. That's what made this thing work. And earlier this week he took my machine and built this website. He took about an hour on that too, so that's how quickly this goes. He essentially cloned Facebook within two hours in Drupal. Not a problem. Seriously, you can't do this kind of stuff without this engine.
Just to give you an example: we put in, just for the heck of it, 250,000 users, and we gave each user, I think, three random connections. Of course other people will link back to the same person, so each person will have more than three connections by now. So can someone give me a number between 1 and 250,000? Number 42; we must have the number 42. By the way, I'm turning 42 next month. You would know this because two years ago I lost my hair for turning 40. So, another number? 1337, I can certainly do that. Absolutely, why not. Find a relationship.
Okay, we've got that. You saw how long that query took: not very long at all. What have we done? We've done three shortest-path queries. The first is: how is this user (and yes, they're randomly created usernames) connected to that user? Well, via that path.
Yeah, that's slightly more than six degrees of separation, but you get the idea. If you do this in a real social network, you would usually get your six degrees if you have a fairly decently connected one. So this one is actually doing more work than you would have to; most people have more than four connections, is the point. We're also doing the six degrees of Linus; one must. So from A to Linus, and then from Linus to B. So the total is three queries, and you saw the time that it took. We can grab some other numbers. Yeah, no, that's a bit too much. Okay, we can load that, and that's the result. Works, doesn't it?
So people can now have friends, but they can also have fans. A fan is a one-way connection, and that's just something that friendlist supports; a friend is a two-way connection. What can you do with this? Well, let's go to the home page. There are some random blog entries that we're really interested in. I have no idea what this means; actually, it means nothing. It's fake Latin. But this blog entry has an author, right there. On the right-hand side there's now a block, which is essentially a custom query, a view inside Drupal; with a couple of clicks we create that kind of stuff. And we can now tell how I relate to that particular author on this website. So that's one query that is only now possible to do, and you can do that with any content on the site.
Okay, that's about it for demos and good fun. I'll take questions. Do I have one minute for the questions, or five minutes? Excellent, that sounds good.
So what if there are two shortest paths which have the same length? Does one of them get returned, can you get both returned, and can you find all the paths between two points?
Yes. Yes, yes, no. At the moment you will get a shortest path, which may be the shortest path or a shortest path. If there are multiple, you don't know which one you're getting, because essentially it doesn't matter that much. You can't at the moment get all of them. We've looked at this, and we would need to do something quirky in the result sets to do this, or add an extra column or something. I don't necessarily want to add more columns, because there are always more algorithms that require more of this stuff, and I want to keep this simple. We don't want to end up with extra columns, one for each algorithm. Well, we can always overload, and there are lots of tricks for that.
There is a way in MySQL to return multiple result sets. That's already available, because if you do straight SELECT statements in a stored procedure, with results that you don't store anywhere, they get tossed back at the client. So there's an API for that, from version 5.0 and up. Now, if you were to do that from the engine, that's entirely doable: you can specify that there are more result sets coming, and each of the result sets would be a path. However, the current infrastructure inside the engine doesn't support this. Anthony knows how to do it, and this was his suggestion. We don't necessarily want to do it. So I mean, yes, it's a valid question, but unless someone says "I really need this now", that's all. Next question.
Given that RDF is a 3-hypergraph, and in fact most implementations use a 4-, occasionally even a 5-hypergraph, how are you encoding that in the standard digraph implementation that you demonstrated here?
Well, what you see is what you get. So it is quite possible that some forms of RDF data are not easily, or not at all, representable. So inside MySQL you'd have to do something smart. That tree of life had only one predicate. No? I think so, right?
Okay, we could look at that. I mean, if the result is something that looks like a graph, it is mappable, but you may need to do quite a bit of transformation to make it so. The RDF data needs to run through your stylesheets and some other magic to make this work. So I think it's doable for most graphs, but there might be a lot of work involved in actually making it so. But once you have those transformations for a data set, they're repeatable. Actually, a return question, because you're the RDF expert: in the end it's all a graph, isn't it? So am I correct in presuming that, apart from the transformations, there are no intrinsic problems in mapping it?
Well, the intrinsic problem here is that RDF, while technically a graph, isn't a graph in the traditional sense of nodes and directed edges between two vertices. In fact, each edge in the graph has three vertices in it, and in most RDF implementations they actually shove a fourth one in there, because it turns out that for all practical purposes, if you want to implement anything, three isn't enough. So, yeah, it isn't actually representable in any sane way as a standard digraph.
Okay, in that case the conclusion is you'd have to do extra trickery. We've had this question from a potential client who essentially needed to attach extra attributes to a link, and the answer at the moment is no, we can't. They actually needed to do fairly complex computations on a link to work out which one was more interesting than another. That's not something we do right now. It's definitely possible, and you could hook in all kinds of trickery, but it's not something that's done right now. There's no technical limitation on getting it done.
So if it needs to be done, it can be added in. I'm just a bit hesitant about feature-bloating the basic idea, because it's already very useful for lots of purposes. It may not be beneficial for all RDF, but heck, there are RDF data stores, and you wrote one yourself. So please, let's use those things as well.
Before the next question, I should mention: there are perfectly good graph databases. One is Neo4j, which seems to be fairly popular, and there are others, with different types of licensing. That's all fine. The thing is, many or most of us already have most of our data set inside MySQL. So what I've seen in the wild is that people have a MySQL data set, grab out part of the graph, put it in Neo4j, do their complex graph searches, grab out the result, put it back in MySQL, join it onto other data and then display it. Given that scenario, here's a more convenient version of the same thing. The actual speed of the search will usually be the same, until you get to extremely large graphs, in which case we run out of memory and they don't.
Next question. Last question? One more. Okay.
Yeah, that was kind of my question: are you able to tag the relationship somehow, so that you can say, you know, yes, I'm related to my children and I'm related to my parents, but that's a different type of relationship?
I would do that via weight. That's how I would encode it; for instance, on this Friendface thing I would encode it by a weight, and then some are weighted higher than others in the search. I don't know whether that's really the suitable way. The weight is a float, so you can do it in fractions, so it doesn't actually matter that much.
I don't know; we could add extra attributes, whatever you want. But tagging that kind of stuff, unless it makes a difference for the search, shouldn't be in this table. If it's just an attribute of the link, then it shouldn't be in the engine; it should just be in a regular database table, and then the link is in here with just the standard weight. So unless something matters for the search, such as whether the weight should be more or less, which would be relevant, you don't want to put it in the graph table.
Last question, I believe. Can you repeat that? How is weight taken into consideration when calculating the shortest path?
From memory, the lowest weight wins. It's added up as you walk, and the path that has the least total weight tends to win. So that's taken over the entire length of the path, which of course, if you have different weights, makes the story more complex; it's more difficult to actually track whether it's done the right thing or not.
Yes, I think it is, and I think in some cases you don't want it to be a sum. I've already had that question. You want to make it into a product or something? It doesn't do that at the moment; that would be a separate algorithm. But again, you can add an algorithm: you make number three, number four, number five, and you can make it do that instead. There's no particular reason why the computation can't do that. By the way, the implementation of an algorithm is about ten lines of code. The problem is that it's ten lines of nasty C++ code. I think we're out of time, but I'll happily show you the source code, and we can browse around in it and you can see what it looks like, because it is not visually complex; it's just a nuisance to work with.
Thank you very much, Arjen. And here is a macadamia nutshell bowl for you.
Beautiful. Thank you. Thank you very much.
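The behaviour described in the last answers (weights are added up along the walk, the lowest total wins, and when several paths tie you get one of them without knowing which) can be sketched roughly as below. This is a minimal Python illustration using Dijkstra's algorithm as a stand-in, not the engine's C++ code; the function name `shortest_path`, the `(origin, destination, weight)` edge tuples and the sample data are all hypothetical, and it assumes weights are non-negative floats.

```python
import heapq


def shortest_path(edges, origin, destination):
    """Return (total_weight, path) for one lowest-total-weight path, or None.

    edges is an iterable of (src, dst, weight) tuples. Each edge is one-way
    (like a "fan" link); a two-way "friend" link needs two tuples.
    Assumes every weight is >= 0.
    """
    # Build an adjacency list: node -> list of (neighbour, weight).
    graph = {}
    for src, dst, weight in edges:
        graph.setdefault(src, []).append((dst, weight))

    # Priority queue of (total weight so far, node, path taken to reach it).
    queue = [(0.0, origin, [origin])]
    settled = set()
    while queue:
        total, node, path = heapq.heappop(queue)
        if node == destination:
            return total, path  # first arrival has the lowest summed weight
        if node in settled:
            continue
        settled.add(node)
        for neighbour, weight in graph.get(node, []):
            if neighbour not in settled:
                # Weights are summed as we walk; a product-based variant
                # would be a different algorithm, as noted in the talk.
                heapq.heappush(
                    queue, (total + weight, neighbour, path + [neighbour])
                )
    return None  # destination is not reachable from origin


# Hypothetical sample: the direct link 1 -> 4 costs more than going via 3.
edges = [(1, 2, 1.0), (2, 4, 1.0), (1, 3, 0.5), (3, 4, 0.5), (1, 4, 1.5)]
result = shortest_path(edges, 1, 4)

# Two paths of equal total weight: only one of them is returned.
tied_edges = [(1, 2, 1.0), (2, 4, 1.0), (1, 3, 1.0), (3, 4, 1.0)]
tied_result = shortest_path(tied_edges, 1, 4)
```

In the first sample the two-hop route via node 3 wins over the single direct link because its summed weight is lower, which is why encoding relationship types purely as weights changes which paths come back. In the tied sample the caller should treat either route as a valid answer.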