The Carnegie Mellon Quarantine Database talks are made possible by the Stephen Moy Foundation for Keeping It Real and by contributions from viewers like you. Thank you. We're super happy today to have Gavin Mendel-Gleason from TerminusDB come give a talk for us about the system that they're building. Gavin is the co-founder and CTO of TerminusDB. He got his PhD in Computer Science from Dublin City University, right? Yep. And before that he did an undergraduate degree in math from the University of New Mexico. He was briefly a PhD student at Carnegie Mellon in Physics, not Computer Science, realized that was a dead-end job, and then moved to Europe to do a PhD in Computer Science. We also have a sponsor today for this talk: the Stephen Moy Foundation for Keeping It Real. The foundation will be sponsoring all the quarantine talks we have this semester, so we appreciate them for helping us out. And this is a real sponsorship; we're not making this up. The way we'll do it today is that, like always, if you have any questions for Gavin as he's talking, please unmute yourself, say who you are and where you're coming from, and then ask your question. Feel free to interrupt at any time — we want this thing to be interactive. Okay? All right, Gavin, the floor is yours. Go ahead. Thank you for being here. Great. Thank you very much, Andy. So I'm Gavin, as introduced, from TerminusDB, and I want to talk a little bit about building a native revision control graph database from the ground up. First I'll give a little bit of an outline of the motivation, and then I'll talk about the architecture. Okay, so the first question, I guess, is: we're a graph database, so why graph in the first place? There are a lot of people out there who are really into relational databases, and I definitely would have fallen into that class.
So I know a lot of the positives, but there are some advantages to graph databases as well. When I was at Trinity College Dublin, we got a three million investment from Europe to do a large-scale research project, which was partly the Seshat Global History Databank. That was a very ambitious project to store information about every polity in human history, and various data points about them, in order to do analytics on the back of it and to try to find information about historical trends. A lot of publications came out of that, a lot of interesting research, including information about the resilience of societies — things that I think are really relevant to our current climate, because climate change and other things can cause societies to collapse. And although people don't remember it now, it wasn't that long ago that we had very big disruptions. So the Seshat Global History Databank was really a sprawling collection of datasets. There were loads of different datasets. It had a very complicated and curated ontology — that is, a design of the various types of things they wanted to store and what they were related to — and it was all kept in a giant wiki, essentially, with all these post-doctoral students and graduate students and undergraduates curating this massive interrelated information. That made it very difficult to get information out in order to do the analysis, but it also had a lot of junk in it, because there wasn't strong schematic control over it. But turning it into an RDBMS was really, really hard to imagine, because you would be talking about either having enormous numbers of tables, or sort of simulating a graph inside of an RDBMS, in order to store all the various kinds of things that were in there. So instead of simulating a graph, we decided to just go with the graph in the first place.
The other thing that was necessary on the Seshat Global History Databank was that they had to be able to change the schema quite a lot. So you really needed tools to both visualize and alter the schema as things moved forward, and to lift the data along with it as it changed — mostly adding new things, but also sometimes modifying information that was in there. So that's part of it. And coupled with that was a need to have revision control, essentially, because they needed some way of being able to see who it was who added information — because it was being added by graduate students, etc. — and then being able to verify it and merge it into something later. So you can already see some of the revision control aspects at the very beginning of the project. The second project we started on was a partnership with Wolters Kluwer, which was looking at commercial intelligence. Again, we already had a graph, so we were sort of working in the graph space, but they wanted to scale it up to a lot more types of relationships between things. And in this case you have another example where a graph is a very natural fit. The kinds of questions they wanted to ask were like: okay, are these two people connected by some kind of relationship, or some multi-hop relationship? We were asked to find things like: is there anybody who was at some time both a director and also adjudicating some case about the company, or something along those lines — conflicts of interest, etc. And the other relatively strange thing that they asked us for, and that we searched for and found, was whether there were any shareholding cycles. So company A owns shares in company B, company B owns shares in company C, and company C owns shares in company A again. One might think that that's impossible, but in fact there are lots of them in Poland, as it turns out. And a lot of them look quite shady.
When we mentioned it to our accountant, they said: oh yeah, the washing machine, I know that one. So you can imagine what's being washed. This was just Poland: basically every director, person, and shareholding of every public company — everything that was registered through the court system, a number of kinds of relationships, their addresses, etc., for all of Poland since 1992. So that ends up being a lot of relationships at the end of the day. And the last one is that it can be useful to use a graph when you don't know what you're indexing. If you have a lot of data in a sort of document format, and you just want to be able to throw it into something and then start anywhere you like inside of the object and look for stuff, then a graph makes a fair bit of sense. So those are the kinds of projects that we've worked on — things where graph is the natural answer. The other question, I guess, is: why a distributed revision control system? We started with some ideas of revision control in the initial project, and as we moved forward, we found that it's actually important in a lot of areas in industry, and it's an underserved area. One of the problems that we have right now is that there's a lot of curated data. The data is not necessarily huge — it can be maybe in the tens of millions of edges in a graph, or around that size. But there's a need to test some code on that data, and only after you know that the code works, deploy it to production. These sorts of pipeline operations are relatively difficult with databases as they currently stand. With Git, with CI/CD type approaches, it's very easy to do those types of operations on code. So there's a sort of hole in the market for that type of CI/CD with data rather than code.
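As a sketch of that shareholding-cycle search: the query amounts to cycle detection over ownership edges. Here's a minimal Python illustration with made-up companies — this is not the actual query we ran, just the shape of the problem:

```python
# Hypothetical shareholding edges: (owner, owned company).
def find_shareholding_cycle(edges):
    """Return one ownership cycle as a list of companies, or None."""
    graph = {}
    for owner, owned in edges:
        graph.setdefault(owner, []).append(owned)

    def dfs(node, path, on_path, done):
        if node in on_path:              # back-edge: we closed a cycle
            i = path.index(node)
            return path[i:] + [node]
        if node in done:                 # already fully explored
            return None
        on_path.add(node)
        path.append(node)
        for nxt in graph.get(node, []):
            cycle = dfs(nxt, path, on_path, done)
            if cycle:
                return cycle
        path.pop()
        on_path.remove(node)
        done.add(node)
        return None

    done = set()
    for start in graph:
        cycle = dfs(start, [], set(), done)
        if cycle:
            return cycle
    return None

edges = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")]
print(find_shareholding_cycle(edges))  # ['A', 'B', 'C', 'A']
```

In a multi-hop graph query language this is a short path query; expressed directly, it's just a depth-first search that reports the first back-edge it meets.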
And Git is really amazing — it's just caused a revolution in the way code is written. It's so much better than when I first started programming. Okay, so now to go back to the architecture, of how we have tried to solve these coupled problems at the same time. We have a relatively unusual architecture, but it has some advantages, and I think maybe you guys can inquire about it. So we represent our data as graphs, as I said, and those are stored as triples. We have some kind of triples, notionally, that describe the graph: you decompose the graph into these triples, say an edge from a node s0, by a labeled edge, to a node s1. It's a labeled graph, because each of the edges has a label, and all of the nodes have labels as well. And we represent these graphs with succinct data structures. A succinct data structure is a data structure, with some kind of access complexity, that is designed to approach the information-theoretic minimum size of the object. So for instance, the labels that we store, we store as a plain front-coding dictionary. And then we use log arrays in order to store the information from the graph. That creates a very compact data structure. Now, in order to update the data structure — because it's so compact, and because it's not really designed as a pointer tree, it's just flat — what we do is we actually store the deltas as separate objects. So we have all of the subtracted triples and all of the added triples, and then you have a chain. And because you have these chains, you end up with something that looks a lot like revisions — like deltas in Git. So the head of your database is just where you start searching. When you come in with a search, you'll look for, say, the triple (9, 8, 7), and if it's in the head layer, it'll immediately return.
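That lookup through the chain of deltas can be sketched like this — a toy in-memory model (the real storage is the succinct structures just described, not Python sets):

```python
# A minimal model of the layer chain: each layer holds added and
# subtracted triples, plus a pointer to its parent layer.
class Layer:
    def __init__(self, added, subtracted, parent=None):
        self.added = set(added)
        self.subtracted = set(subtracted)
        self.parent = parent

def contains(head, triple):
    """Walk down the chain from the head until the triple is resolved."""
    layer = head
    while layer is not None:
        if triple in layer.added:       # added in this layer: visible
            return True
        if triple in layer.subtracted:  # deleted in this layer: masks lower layers
            return False
        layer = layer.parent
    return False                        # never added anywhere in the chain

base = Layer(added={("s0", "label", "s1"), ("x", "y", "z")}, subtracted=set())
head = Layer(added={(9, 8, 7)}, subtracted={("x", "y", "z")}, parent=base)

print(contains(head, (9, 8, 7)))             # True: found in the head layer
print(contains(head, ("x", "y", "z")))       # False: subtracted above the base
print(contains(head, ("s0", "label", "s1"))) # True: found in the base layer
```

With fully bound triples, the walk stops at the first layer that mentions the triple; the unbound-variable case described next has to visit the whole chain.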
If instead you search for (x, y, z), you'll go down to the furthest layer down, you'll find (x, y, z) there, and then on the way back up you'll see that it was subtracted in a layer above. Now actually, if on the way down you know that x, y, and z are all bound to definite values, you can stop at the second layer without going all the way to the bottom. But if you're actually searching with generic unbound variables, then you have to go through the whole thing. So, in order to point to the head, we have something called a label. The label is a file that points at a layer; it tells you what the current head is. That means you can do updates in a safe manner: you can do a speculative update, try it and see if it's valid, and then only when you commit do you move the head, in a single atomic operation. And these things can happen concurrently — you can have as many searches as you like simultaneously. There's no locking necessary there; you only have to worry about when you move the head itself. Okay, so we're cooking. So these labels — you have, like, a master file that points at whatever the root of the head is? So you put down a file, you put all your changes in there, and then you do a compare-and-swap on the head to point at your new file? That's it. That's it, exactly. Okay. But for, for instance, the Seshat project, that wasn't sufficient complexity — we actually needed more things at once. So we store our schemas as triples as well. Our schema is a graph, and that graph describes what's in the instance graph. The instance graph is constrained to always conform to the information in the schema. We also want to be able to keep information — metadata — about the commit history: the timestamps, authors, messages, and that kind of thing.
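In the real system the label is a file on disk and the swap happens at the filesystem level; this in-memory Python sketch just captures the compare-and-swap semantics of moving the head:

```python
import threading

class Label:
    """Points at the current head layer. Readers just read `head`;
    only the head move itself is synchronized."""
    def __init__(self, head):
        self.head = head
        self._lock = threading.Lock()

    def compare_and_swap(self, expected, new):
        """Move the head only if nobody moved it since we read it."""
        with self._lock:
            if self.head == expected:
                self.head = new
                return True
            return False

label = Label("layer_0")
seen = label.head                  # read the head before building the update
new_layer = "layer_1"              # stands in for the speculative delta layer
print(label.compare_and_swap(seen, new_layer))  # True: head moves
print(label.compare_and_swap(seen, "layer_2"))  # False: a stale writer loses
print(label.head)                               # layer_1
```

A writer that loses the race simply sees False and retries on top of the new head; readers keep traversing whatever chain they started on, which is why no read locking is needed.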
And then you also want to be able to store information about branches and tags and versions and things like that. So we developed another approach, which is that we allow graphs to store information about what layers they're talking about. And then you move the head of that graph after you've already put all of the subordinate graphs into place, in one big operation with the head again. So you move the heads of the individual graphs that are associated — the schema graph, the instance graph — then you move the commit graph head, and then you move the pointer from the label. Okay, and I have a picture for that, for you, Andy, to make it a little bit easier to see. So this is the simplest case. We have a label object — it's called the myDB label — and it points to a layer. And inside of that layer there's a label object as well, which points to another layer. And when you update, you add a new layer that describes this movement — that you've added another layer — and then you finally move the database label at the end of the day. Okay. So that object which keeps track of which commits have been made is called the commit graph, and it is actually a graph itself. So why a commit graph? Well, we have to keep information about these commits: we want the history, and we want to keep the metadata about it. So the commit graph has the commit objects; it has the named graph objects which are being pointed to, and it tells you whether they're instance, schema, or inference graphs. I haven't talked about inference graphs, but inference graphs allow you to also do automatic inference using a rule-based system. And then you have layer references, which are pointed to by the named graphs, and those layer references actually point to the on-disk layers. Okay, and then we also have some technicalities from RDF.
So RDF is a way of describing all of the labels of your system: instead of using a sort of nominal pointer or something like that, you use an IRI, which is supposed to give you a way to have a semi-readable but unique object identifier. Okay, so this is a pictorial representation of what the commit graph looks like. F, E, D, C, B, and A are all commit objects; they point to their parent, and each of them has an instance and a schema pointer, and each of those graphs then has a pointer to the layer object that's associated with it. And this entire commit graph itself is also the head of some chain of layers. The commit graph is also a validated graph — it also has a schema — and all of the schema machinery works in the same way that it does at the lower level, for the databases themselves. It actually shouldn't be necessary, because hopefully everything that we put in the commit graph is directed by our system, and our system never makes any mistakes. But actually, sometimes we've made mistakes, so at the moment we leave schema checking on, because it doesn't seem to impact performance too much, and it has found bugs in the way that we were entering information. So you can keep constraints. The schema in your world is just, like, you have triples? That's right. Okay. Yeah, so there's a graph that represents the constraints that have to be satisfied by our instance graph. Okay, but again, those constraints are just, like, it has to be a triple, right? There's nothing — like, what else is there? Does the schema include the type of what the value has to be?
Yeah, no, it can say things like: you can only have so many of this property, or you can't not have this property, or these are the properties you have, or things like: you point from this thing to that thing. So it's a high-level description of what kinds of things can be pointed to, to give constraints on the shape of the graph. All right, cool, awesome. Thank you. That's right. Okay, and the commit graph itself is queryable in our query language. Our query language is a sort of Datalog-like language that allows you to query the graph, and all of the internal queries inside of the database also use our query language in order to do operations on the commit graph. So it's quite meta-circular. And then we have one layer above that, which is a metadata graph, which points to the commit graph. The reason for this is that, like in Git, you often want to have remote objects: you want to keep two things in sync, you want to be able to pull and push them, so you have to have some way of reconciling the history. So we have another graph that points to all of the repositories that are currently being talked about by the database. It will also have the remote URL, so it knows how to send to and receive from the remote object. And it will have other information in the future, to do pipelining operations more effectively. Okay, so with the metadata graph, we're able to do things like clone, push, pull, fetch, etc. And those operations are implemented by actually transmitting the layers from the commit graph across the wire. So finally, we have one other graph, a system graph, but it's not in the hierarchy, actually. What does it contain? It contains information about authentication and authorization and capabilities.
So it decides whether or not you're allowed to do some kind of operation on some certain set of objects, and it controls the query to make sure that you don't do anything dodgy. It also keeps track of all the database names, and who the organizations and authors are that are associated with them. But the system graph is not at the top — it's not actually above the metadata graph. And the reason for that is that you want to be able to move all of the databases independently, without having contention on the system graph. So that's the reason that it's separate. So the overall architectural picture is here. You see the system graph has information about the name of a layer, and that name then gives you the head of a metadata graph. And then you have local and remote; the local would point to the head of the commit graph. And in the commit graph you'd look for, maybe, the main branch there, and then you can find the head layer for the main instance graph and the main schema graph from there. That gives you the high-level overview of what the system looks like in terms of its architecture. So maybe I'll just give you a peek at the system. So real quickly — this is the physical storage layer, right? So the user that writes the query over the REST API, they don't know about any of this, right? No, you just have revisions; this is internally how you're maintaining it. That's right, that's right. So the only thing they'll see is: I have a database, it has some branches, and they'll see that there are revisions for that branch. And then you can go in and you can write queries, and the queries will look at the instance data, or you can ask specifically to query the schema information.
But mostly, with the schema information, either you want to add something to the schema or take something away, or you want it to constrain the information that you're writing to or taking away from the instance graph. Okay. And then the way you would envision someone using this is, like, it's not like you're doing fine-grained transactions, updating a single record. It's like: here's a batch of updates, apply them, and then that gets tracked as a separate commit? Because you'd have to sort of update a lot of things to update one record. That's right. So yeah, you can imagine you might add a whole bunch of things at once. Although it's quite easy — it's not such a big deal to add just one object at a time. So if you take, for instance, the Seshat Global History Databank: it's a very complicated schema, but all polities in human history is not actually that many polities. And if you're only tracking their population, their area, their religion, that kind of stuff, it's actually not that much information in total. Especially since a lot of times the information is most relevant at 100-year intervals or something like that — or 30-year intervals is probably better, because you want to get closer to generational cycles. If that's the kind of granularity, then it's quite easy to envisage that all of these things are being updated at once. Now, in order to increase performance: if you have a large number of revisions, the queries are going to get slow. So we have something called the delta rollup. You can ask for a certain layer to have an equivalence layer, and then you can search in that equivalence layer if you just want to do a query on the database at that point. That's like a compaction or a squashing? That's right, it's a compaction. Yeah, exactly.
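The delta rollup can be pictured as folding a chain of (added, subtracted) delta layers into one equivalent layer with the same visible triples — a toy sketch of the idea, not the on-disk format:

```python
# Each layer is a pair (added, subtracted); the chain is oldest -> newest.
def rollup(chain):
    """Squash a chain of delta layers into one equivalent layer."""
    triples = set()
    for added, subtracted in chain:
        triples -= subtracted   # apply this commit's deletions
        triples |= added        # then its insertions
    return triples, set()       # one layer, no deletions needed

chain = [
    ({("mike", "balance", 100)}, set()),
    ({("mike", "balance", 150)}, {("mike", "balance", 100)}),
    ({("jim", "balance", 20)}, set()),
]
added, subtracted = rollup(chain)
print(sorted(added))  # [('jim', 'balance', 20), ('mike', 'balance', 150)]
```

Queries against the equivalent layer then resolve in one step instead of walking the whole chain, which is exactly why it behaves like a compaction.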
Okay. So this is what you see when you download TerminusDB and go to localhost — you actually get a sort of front end to the database. So you can create a new database; let me try to get my notes really quickly. The mascot is a cow? It's a cowduck. Okay, I wasn't sure what that was. Yeah — there's a story behind it, but it has to do with the graphs. So you see that? I see it, yes. Okay. I see "bank". Yep. Okay. So here we can ask for it to be shared. I'm currently logged into Terminus Hub, and Terminus Hub gives you a way — it's like GitHub — to push and pull from there. You're probably familiar with that from Dolt, because they've also gone in that direction. And you can add "banking example" as a description of the database, create the new database, and then it goes ahead, creates it on Hub, and then clones it to my local machine. So then I can do push and pull operations on it. Then I can go to, for instance, query, and this gives me a query viewer, and I can say: okay, I want to add a schema to this. So I have a schema written, and what it says is: I want to add a bank account type of document, with the property owner, and I want the datatype of the value to be a string. And then balance: I want the balance to be a non-negative integer, and I want both properties to be constrained to be of cardinality one. I run the query, and it does those inserts. And if I go to schema and look at the OWL, it actually writes this RDF OWL information for me. I'm not sure how familiar you are with semantic web stuff, but OWL is a language that was used for describing ontologies. It's nice because it can give you information about cardinalities, etc. And you'll notice that I made a mistake. So I can either edit through the query system, or I can go in here and just edit the file.
The label of balance was "owner" there, which is wrong. So I save the changes, and it actually updates the schema. Here you can see which graphs I have: I have two, they're both called main, and one of them is an instance graph and one of them is a schema graph. Once I have a schema, I can go back to the query browser and ask to add some triples. I run the query, and it adds the information to the database. Then if I look here at documents, you can see I've added this document ID, Mike, and it's a bank account object. So that gives you an idea of how the approach of adding information goes. However, I might want to edit some of the information. So I might say: okay, I want to search for Mike, find out what balance Mike has in his bank account, and then I want to get rid of that balance and add a new balance. I try to run the query, and it says: actually, that's a violation, because this is a non-negative integer, and it's now negative seven. So you can't do that. And it throws back — it actually sends you back — a JSON-LD witness object. You can see the information here, but if you're looking at it programmatically, it's quite useful, because you get information about what the precise error against the schema was, and there's also a schema that describes what that error is. Okay, so let's see. If I go back here to the bank, I can go to manage, and I can say: create a new branch, call this new branch "branch office", and create the new branch starting from the time now. You can edit that by clicking this time object; the time object will look at the past history of commits, so you can move to past commits and branch from them. So I create this new branch office, and now when I go to the branch office, right now I have the same documents in it.
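The check that rejected the negative balance can be sketched as plain-Python constraint validation. The document shape below is hypothetical, standing in for the schema checker and the witness objects it produces:

```python
# Toy version of the schema checks from the demo: owner is a string,
# balance is a non-negative integer, and each property has cardinality one.
def check_bank_account(doc):
    """Return a list of witness-style violation messages (empty if valid)."""
    violations = []
    for prop, check in [("owner", lambda v: isinstance(v, str)),
                        ("balance", lambda v: isinstance(v, int) and v >= 0)]:
        values = doc.get(prop, [])
        if len(values) != 1:
            violations.append(f"{prop}: cardinality {len(values)}, expected 1")
        elif not check(values[0]):
            violations.append(f"{prop}: value {values[0]!r} violates its type")
    return violations

print(check_bank_account({"owner": ["Mike"], "balance": [100]}))  # []
print(check_bank_account({"owner": ["Mike"], "balance": [-7]}))
# ['balance: value -7 violates its type']
```

The real system returns this kind of information as a JSON-LD witness, which is itself schema-described, so tooling can react to the precise violation rather than a string.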
However, if I add some information to it — say I add another guy, Jim, and run the query — then when I go to documents, you'll see I have Mike and Jim in there, but in main I only have Mike, in the branching system. Okay, so then if I go to the query system here, and I say I'd like to add one more person — say Jane — I run the query and get a new update. And then when I go back to the branch... sorry, I was in main when I did that. So what I will do is a rebase operation, starting from main and merging into the branch office. For some reason I can't do that. It's okay. Yeah, sorry — I mostly use the programmatic interface. That was actually my question — but people have, like with Git, right, some message that tells you how to fix it? Oh, yeah. Yeah. Sorry, I can't see it. Anyhow. Yeah, I mean, what I would really do, when I'm using it in anger — sorry, my Zoom thing is not letting me release. You're trying to, like, un-share and then add it back? Yeah, I was trying to un-share this, but — at the top there should be, like, "stop share". Yeah, except I tried to grab it and it moved off of my screen, so I actually can't un-share it. Can you see that? I see that, yes. All right, that's better. Okay. So what I would actually do is write something in Python in order to do it. For instance, you would actually implement all of these operations — creating the graphs and doing the execution of these queries — from Python directly. It's a lot more convenient than using the front-end interface, but it's nice to be able to use the front end for browsing. And it's actually not bad for querying, if you're not trying to do programmatic updates or anything complicated. So, I mean, the Dolt guys — again, not to do a comparison, but they had basically a clone of the Git command line tool. That's right.
Like, is there something similar like that for you guys? Or is it the Python or the web interface? Right now it's the Python and the web interface, but we are definitely going to make a command line tool in the near future. Got it. Yeah, it's not too complicated to do on top of it. So, that sort of concludes it. Do you have any questions? Awesome. All right. So I'll applaud for people, because, again, we're virtual. And if anybody has any questions, unmute yourself, and then again, say who you are and where you're coming from; we can do this for a bit. So, if anybody has any questions, I open it to the floor. Yeah, this is Joel Bender, Cornell University. I have a relatively large batch of RDF content to upload. Is there an easy way to do that? Yeah. So, I mean, you can just upload it — there's a fast TTL load thing. So just create a database in Python, then open the turtle file and do update-triples or insert-triples, and it'll throw it in there. Okay. And the next question is: do you have a connector between RDFLib and Terminus? No, we don't have a connector right now, no. Okay. Yep. Okay. Awesome. Thank you. Next person? So, okay. I see — obviously you come from the ontology world, and therefore maybe the OWL stuff for you is sort of second nature. But, like, by not supporting SQL — or even the other graph languages that are out there, Gremlin or Cypher — would the kind of person using Terminus be very self-selecting? Like, it wouldn't be your average programmer, right? This seems like someone who has a very serious problem, like: oh, I have some RDF data that I want to store. Whereas, like, the Neo4js of the world are trying to be sort of the graph database for everybody.
And my question is: how much of what you're describing could have a more user-friendly interface — something that people are more familiar with than the ontology APIs or query languages? Does that make sense? Yeah, absolutely. So if you wanted to just store an arbitrary graph in Terminus, you can just not have a schema. And if you don't have a schema, it won't check anything, and then you're very much like the other sorts of graphs that are out there. The problem with that is — with a relational database management system, you get tables, and tables are really nice in that they give you constraints on the kinds of information that get entered. Now, that doesn't perfectly stop you from doing terrible things, but it does help to maintain some level of consistency. With graphs, it's really easy to end up with just complete spaghetti, because you can attach anything to anything else and point anything at anything else. So if you don't have some kind of constraints on what's pointing to what, and what it's supposed to be pointing to, then you get some negative impacts. The second thing — My question for that one is: how often do people show up at the door saying, "I'm interested in using TerminusDB, I have a graph database," and they have a schema already? Like, I think nobody, nobody, right? So it's a question of, okay, how do you make a schema? We have to kind of teach people to do it. But if you saw the schema that I entered there, I entered it in a query format. It's not that hard to describe which properties you have — an object that has these properties, that points to these objects. It's not too hard to write it down, and I think it's not too hard to get people to start thinking in that way.
And I think, if we're going to start using graphs seriously, people are going to have to start thinking in that way, because otherwise it becomes pretty unmanageable quickly. But the other advantage is that if you're careful about describing your schema, then you can pull objects out very easily as documents. And that can be really helpful when you're trying to use it in a programmatic way. Because with a graph — if you get back a table in SQL, it's kind of straightforward what you've gotten out, but a lot of times you want a fragment of the graph, and figuring out how to get that fragment, and then get it into a sort of key-value-pair thing, is just extra work. So it's nice if you can get it out directly. People are quite comfortable with document databases, I think, and so it's nice to have a document database that you can also search as a graph. Got it. And I interrupted you — there was something else you were going to bring up as well, and I didn't let you cover that. What's that? Like, you said two things, and before you got to number two, I interrupted you to ask my other question. Oh, I can't remember what number two was. Okay, that's all right. And then, like, WOQL — that query language — are you specifying the exact steps of whatever algorithm you want to do? Or is it still kind of high level? Like, can you do a nearest-neighbor search, or a path traversal? We have high-level things like path traversal, and we'll be adding more things along those lines as we go forward. But yeah, it's both high level and low level. At the moment, query optimization is almost non-existent in the system, but we intend to add it as we go forward.
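A path-traversal predicate of that kind boils down to a recursive Datalog rule; here's a minimal bottom-up evaluation of it in Python, with made-up edges (a sketch of the semantics, not WOQL itself):

```python
# Datalog-style path query:
#   path(X, Y) :- edge(X, Y).
#   path(X, Z) :- edge(X, Y), path(Y, Z).
# computed bottom-up as a fixpoint over the edge relation.
def path(edges):
    reachable = set(edges)          # base case: every edge is a path
    frontier = set(edges)
    while frontier:
        # join edge(X, Y) with newly derived path(Y, Z) facts
        new = {(x, z) for (x, y) in edges
                      for (y2, z) in frontier if y == y2} - reachable
        reachable |= new
        frontier = new              # stop when nothing new is derived
    return reachable

edges = {("a", "b"), ("b", "c"), ("c", "d")}
print(sorted(path(edges)))
# [('a', 'b'), ('a', 'c'), ('a', 'd'), ('b', 'c'), ('b', 'd'), ('c', 'd')]
```

Expressing the same recursion is what tends to get awkward in SQL-flavored graph languages, which is part of the argument for Datalog-ish designs made below.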
So really, you're kind of giving the layout plan of how it does the search, unless you use one of the higher-level predicates like path queries. Okay. And WOQL is specific to TerminusDB? That's your language that you've added? That's right. Yeah. It would be possible to expose other query languages on top, especially the ones designed for RDF. But I think RDF's not super popular at the moment anyhow, so I'm not sure it matters that much. I think the main reason somebody is going to come to this is that they want a document store with revision control, or they want some kind of view into pipelining and that sort of thing. The query language is really not going to be the main reason they start using TerminusDB. But I think personally that Datalog-ish query languages are just a lot better than the approaches being tried in a lot of the graph community. I mean, SPARQL is kind of trying to fit Datalog into an SQL syntax, and then you end up with something that doesn't compose very well and is really awkward. And then there are the other ones with all these iteration-type approaches. I don't know if you've looked at TigerGraph or any of those. TigerGraph? TigerGraph, yeah. I find that their interfaces just don't have the beauty or simplicity of Datalog. So I think whatever wins in the end is going to be some kind of Datalog-ish looking thing, just because it's more convenient to express recursive algorithms and things like that in it. I mean, this is more of a comment about the graph database market in general.
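The claim that recursion is natural in Datalog can be shown with the textbook example. The two rules `reachable(X, Y) :- edge(X, Y)` and `reachable(X, Y) :- edge(X, Z), reachable(Z, Y)` express graph reachability directly; the sketch below (plain Python, no real Datalog engine) computes the same thing as a fixpoint over a set of edge facts, which is exactly what a path-traversal predicate hides from the user.

```python
# Naive fixpoint evaluation of:
#   reachable(X, Y) :- edge(X, Y).
#   reachable(X, Y) :- reachable(X, Z), edge(Z, Y).
edges = {("a", "b"), ("b", "c"), ("c", "d")}

def reachable(edges):
    facts = set(edges)  # base case: every edge is a reachability fact
    while True:
        # join existing facts with edges to derive new reachability facts
        new = {(x, w) for (x, y) in facts for (z, w) in edges if y == z} - facts
        if not new:
            return facts  # fixpoint: no rule derives anything new
        facts |= new

print(sorted(reachable(edges)))
```

Expressing the same query in an iteration-style graph language typically takes an explicit loop or accumulator; in Datalog it is two declarative rules.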
But the danger of everyone coming up with their own languages is that you may repeat the issues the object-oriented databases had in the late '80s, where everyone had their own query language, no one could ever standardize, and there was never one thing that unified everyone. And therefore the ecosystem that could be built around a single language, which is where you'd really benefit, just doesn't get cultivated, right? If I have SQL, then in theory I can support Tableau, MicroStrategy, Crystal Reports, whatever, no matter what database I'm using. But if I have my own query language, I don't get that. You don't get people writing stuff that can use your database for free; you have to build everything yourself. Yeah, yeah. I mean, fragmentation is a problem for the graph database community in a similar way that it was there. We've tried to be standards-compliant on most things that we have: it's OWL and RDF, and we can generate valid OWL and RDF, dump to those formats, and read from them. But in terms of SPARQL, we could expose a SPARQL endpoint with very little difficulty, but it's just such an awkward... What's that? Does anyone actually use SPARQL? No, not really. Well, they do in certain circumstances, in the academic community to some extent, but it's not widely used. So I don't think it's necessarily worth supporting. And the other graph query languages are even more awkward, I would say. So you could standardize on something really awkward, but then there's also a difficulty with that. But the thing about SQL is that it has a simplicity of design that's quite powerful, and it's very composable. It all makes sense together; it has a self-consistency to it that's really beautiful.
And we need something like that, something that has that sort of simplicity of design, and I don't think the other graph languages really have it. Whatever wins is going to have to be something like that. We would like to see more people use WOQL, for it not just to be for TerminusDB, but that's another matter. So I have a follow-up question on the relationship between your schema language and SHACL. I personally find SHACL a lot easier to deal with than OWL, and it seems like it would be pretty straightforward to write a translation from some kind of SHACL shape into a Python application for me, because I'm a Python programmer, but I don't know about your custom language. Right. So, yeah, it would be relatively easy. SHACL has a couple of things about it. OWL is designed as an ontology language, not as a constraint language; SHACL is the constraint version, related but different. And the problem there is that now you have two languages to describe what you're trying to describe. So we took the approach of just using OWL and treating it, under a closed-world interpretation, as a constraint language, instead of using the open-world interpretation. So any information that's stored in TerminusDB and exported will still be valid OWL, but we're more strict about it, so that you get the kinds of constraints you would get from SHACL. So it's not really necessary to use SHACL. Now, we did look at SHACL first for a little while, but when I was looking at it, it was defined in such a way that there were self-consistency issues in the documentation, and I noticed them; I sent a note that there were problems.
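The open-world versus closed-world distinction being described can be made concrete with a toy cardinality constraint (illustrative only, not how TerminusDB implements it). Under the closed-world reading, a missing fact counts as false, so a constraint like "every Person has exactly one birthdate" can actually be enforced; under the open-world reading of plain OWL, a missing birthdate would merely be unknown, not a violation.

```python
# Toy constraint check: every listed person must have exactly one
# 'birthdate' triple. Closed world: absence of the fact is a violation.
triples = [
    ("alice", "birthdate", "1990-01-01"),
    # bob has no birthdate triple at all
]

def closed_world_violations(people, triples, prop="birthdate"):
    """Closed-world check: count the matching facts we actually have.
    Open-world OWL would instead assume an unstated birthdate might exist
    and conclude nothing."""
    bad = []
    for person in people:
        count = sum(1 for s, p, _ in triples if s == person and p == prop)
        if count != 1:
            bad.append(person)
    return bad

print(closed_world_violations(["alice", "bob"], triples))  # ['bob']
```

This is why the same OWL vocabulary can serve as both an export format and, with the stricter interpretation, a SHACL-like constraint language.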
So they had recursive operations with cardinality defined in the documentation, and that actually ends up creating a non-monotonic functor, so it's not a sound object: you can't actually say whether something is inside or outside of it. Those sorts of consistency issues had not been worked out yet, whereas OWL had a much firmer logical foundation, so it was easier to implement in a way that made sense. Yeah. And for some reason they decided to allow you to embed SPARQL queries inside SHACL, which is like, come on, really guys? Yeah, yeah. Okay. Any last questions? Well, I have two more questions myself; I'll let everybody else go first if they want to. Okay. So the first question is: what is the backstory of the cow-duck, or the duck-cow? So the cow-duck is the love child of the cow and the duck. One of the first demonstrations of our Rust back end was that a duck fell in love with a cow, and we just had their relationship modeled in the graph. That's how we ended up with the mascot. Got it. Nice. Beautiful. All right. And then my last question is the one I've been asking all the database friends who've come to give talks this year. I actually suspect I know the answer, but I want to hear your opinion. My question would be: at this point, how stupid are your users? Like, are you ever surprised by the kinds of problems people hit with your system because they're abusing it in ways you'd never thought of? Well, I'm never surprised when somebody abuses the system. I mean, our community so far has been really intelligent, and I think there's an attractiveness to the idea of a revision control graph database.
And I think the most surprising thing has not been that. I guess it doesn't surprise me too much when somebody wants to do something abusive to the database, because that's exactly what I would probably do if I started using one: try to figure out what happens if I mistreat it badly. But some of the really clever things people have come up with are more surprising. For instance, people have been looking at the commit graph and, instead of just seeing it as a revision control system, thinking of it as graphs that change over time and then doing queries across the various graphs. So you get a sort of temporal graph logic, questions you can ask like "how did this thing change into that?" That hadn't really struck me, because I hadn't thought of that use case; we were just thinking about it in terms of revisions. Yes. Okay, I actually was going to guess that your customers are a bit more intelligent than other database customers. Because, like I said, I think the RDF and OWL stuff is a bit of a bar to entry that people have to clear to come in, and people aren't coming to you because they stumbled across it thinking, "I want a graph database." They have very specific ideas about what this thing should actually do, and they're sort of self-selecting. Yeah. So it's going to be people who really feel they need a solution to a particular kind of problem, maybe a revision control graph database system, and if they come for that, then they're probably already much more sophisticated in the first place. Yes.
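The "graphs that change over time" use case can be sketched simply: treat each commit as a set of triples, and a temporal question becomes a comparison between versions. This is purely illustrative; TerminusDB's commit graph stores revisions far more compactly than whole copies.

```python
# Two versions of a graph, modeled naively as full triple sets per commit.
commit_1 = {("alice", "role", "engineer"), ("bob", "role", "manager")}
commit_2 = {("alice", "role", "manager"), ("bob", "role", "manager")}

def changed(before, after):
    """Triples added and removed between two versions of the graph,
    i.e. the answer to 'how did this thing change into that?'"""
    return {"added": after - before, "removed": before - after}

diff = changed(commit_1, commit_2)
print(diff)  # alice's role changed: one triple removed, one added
```

Queries "across the various graphs" generalize this: instead of diffing two commits, you run the same graph query against each commit and compare the answers over time.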
So hopefully we want to move to a broader selection of users, but we're going to start from there and then start making it easier for people to come on board with simpler things, like automatic CSV upload, where you just click and it goes in, and then Excel, for instance, because it's quite easy to write down an ontology that describes what an Excel spreadsheet looks like, and then you could just throw it into the graph. Okay, that's awesome. Yeah, this is about what I thought it was going to be. Okay. So again, Gavin, I appreciate you doing this. I meant to announce that you're actually in Vienna, in Austria. I don't know what the time difference is with Ireland, but it's probably a little bit late for you. Yeah, it's 11:30 here. So we appreciate you staying up, and thank you for spending time with us. It was a really interesting talk. Thanks for having me.