"Show me." "I know kung fu." "Whoa." "Show me." So today we'll learn how to simplify and power up your Rails apps using a graph database. But first, let's look at what the current database landscape looks like.

We're going to start with the lower complexity end. By lower complexity I mean data that's not very connected but is high volume, like a key-value store. This is something like Riak, Redis, Memcached, KC, that kind of stuff, and it's useful for session data, monitoring, or huge catalogs. Then you have columnar, which has a little bit more structure, something like HBase or Cassandra, and I think HBase was modeled on Google's Bigtable. You use those for product search engines, stuff like that. Then there are document databases, which have a little more structure, and there you have the notion of collections: MongoDB and CouchDB. You use those for health records, insurance, and stuff like that. And then the one we're most familiar with is the relational database. You can use it for something simple or something very complex: MySQL, Postgres, Oracle. You see it at places like banks, or really anywhere.

So the question is, where does a graph database fit here? The answer is that a graph database is as general purpose as a relational database, but you can also treat it as a key-value store, or treat each node as a document. So in a sense a graph database is like a superset of all these databases, which is pretty cool. And of course, when I talk about a graph database, I'm talking about a way of storing data in a graph data structure, right?
So you're familiar with linked lists, or you should be anyway. If not, that's okay; I wasn't familiar with them until I went to college. Then there are trees, and then a more general-purpose data structure is a graph. Since it's so general purpose, it allows you to store data however you want to. It's pretty cool.

Let me show you a quick example of how you would do this in a relational database. Say you have a users table and an interests table, and you want to see which user has which interests. You would say: I'm going to start with this user, look up this user ID in the user-interests table, and then correlate that to find the interests. Very straightforward; we're all very familiar with this.

So let's go from thinking relational to thinking more in terms of relationships. What we do is spread these records out as nodes and connect them with relationships, or edges. Then we say: I want to start with this user, and I want to know what this user is directly connected to, and those are my interests. That's it. See, a graph database focuses on the relationships between the data rather than on the commonality among values, foreign keys and stuff like that. This meshes very naturally with the way we humans conceptualize data. We think of families as trees; we think of our friends as a graph. Most of us don't imagine personal relationships as self-referential data types, right?
We don't think, oh, this is my foreign key over here; we think more in terms of a graph. What I've been describing here is technically a property graph. A property graph has nodes, it has relationships, and it has properties on both. That's important because it makes both the nodes and the relationships first-class citizens: you can put key-value pairs, the properties, on both a relationship and a node. Relationships are also typed. A relationship could be something like "I follow you", and a different type of relationship could be "you block me", because I'm being annoying or whatever. So it's typed, and it's directed as well; more on that later.

You may be thinking right now, man, lunch was very good. Or you may be thinking, I want to know more about graph databases. So I'm going to use a graph to describe a graph. A graph records data in nodes. It records data in relationships. A relationship organizes nodes. Nodes have properties, and relationships have properties. You can also see that the relationships are directed here, so they have a sense of direction. In most cases you don't care about direction, but again, it's schema-less; it's up to you how you structure stuff in it.

So what could we use this for? I've been asked this question a lot. It may or may not be obvious to you that this is very much like a social graph: I follow you, you follow me. That's very obvious, but there are so many more things you can do, and this is barely the tip of the iceberg. Monitoring; routing, and by routing I mean something like network packet routing, where you can ask where a packet has been and track all the points along the way; your brain; genealogy; all kinds of stuff.
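The property graph model described above (nodes, typed and directed relationships, properties on both) can be sketched in a few lines of plain Ruby. This is a toy in-memory structure for illustration only; the names `PropertyGraph`, `add_node`, `relate`, and `outgoing` are made up for this sketch and are not Neo4j's API or any gem's API.

```ruby
# A toy in-memory property graph (hypothetical names, not Neo4j's model).
Node = Struct.new(:id, :properties)
Relationship = Struct.new(:type, :start_node, :end_node, :properties)

class PropertyGraph
  attr_reader :nodes, :relationships

  def initialize
    @nodes = []
    @relationships = []
  end

  # Nodes carry an arbitrary hash of properties.
  def add_node(properties = {})
    node = Node.new(@nodes.size + 1, properties)
    @nodes << node
    node
  end

  # Relationships are typed and directed (start_node -> end_node)
  # and carry their own properties.
  def relate(start_node, type, end_node, properties = {})
    rel = Relationship.new(type, start_node, end_node, properties)
    @relationships << rel
    rel
  end

  # Follow outgoing relationships of one type from a node.
  def outgoing(node, type)
    @relationships
      .select { |r| r.start_node.equal?(node) && r.type == type }
      .map(&:end_node)
  end
end

graph = PropertyGraph.new
me    = graph.add_node(name: "me")
you   = graph.add_node(name: "you")
graph.relate(me, :FOLLOWS, you, since: 2012)
graph.relate(you, :BLOCKS, me, reason: "annoying")

puts graph.outgoing(me, :FOLLOWS).map { |n| n.properties[:name] }.inspect
```

Because direction is part of the relationship, asking who "you" follows returns nothing here, even though "me" follows "you".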
It's very cool. And what's most exciting is that up until a few years ago, there wasn't really an easy way for us to get hold of a graph database that we could deploy to easily. And then in came Neo4j. The J is for Java. It's a property graph: again, nodes and relationships, and properties on both. It's perfect for complex, highly connected data. They call it whiteboard friendly: basically, if you can draw it on the board, then you can represent it, capture it in the database, and store it that way, which is pretty cool. It's very scalable. I think it's supposed to be able to take 32 billion nodes, 32 billion relationships, and 64 billion properties. Lots of billions of stuff there that you can fill up. Very cool.

And the most exciting part is that it has a REST API, and therefore it also has a Neo4j Heroku add-on, so you could right now spin up a Sinatra app or a Rails app and start messing with this stuff, tinkering with it and deploying it. So that's pretty exciting. Now, when Neo4j was created,
it wasn't created as an academic exercise. It was created out of necessity. These guys in Sweden had a relational database and they were trying to solve a problem, and what they decided to do was create a sample social graph to track their progress as they built it. It was going to be a thousand-person social graph with an average of 50 friends per person, and the query was going to be: does a path exist between person A and person B? This is basically the small-world phenomenon, that you can reach anybody within six degrees, or the Kevin Bacon path. Everybody's familiar with that. Six Degrees of Kevin Bacon is awesome.

That's not to say you couldn't do this in SQL. There are lots of ways you could do it in SQL. You could use recursive SQL, which would be very inefficient, but you could do it. You could do it using a closure tree, but that just adds a whole bunch of overhead and space. Why would you do that if a graph database does it for you, and naturally? Before I move forward, I wanted to show you a way to do this in Postgres using recursive SQL. It's pretty ugly, and let me point out... this is teamwork. Oh, it disappears. UNION ALL should be like the worst thing that you could do. So anyway, I'm going to remove it from the screen because it's embarrassing. Yeah, don't do that.

So here we go. What were the benchmarks these guys were getting? Take the relational database first.
It took on average two seconds, 2,000 milliseconds, to do that query, and on Neo4j it took two milliseconds, which is pretty awesome. And when they increased the sample size to millions of records, the MySQL database didn't even finish. I'm not saying the SQL I showed is what they used, because they were using MySQL and my sample was Postgres. But the point is that in Neo4j, with millions of records, it was still two milliseconds, which is pretty cool. And if you ask me, it's dogs all the way down, baby. Wait, or is it lions all the way down? I'm confused. Your loss care and confused me. Anyway, which brings me to my next point. It doesn't really... it does.

My next point is interacting with it. The key to understanding why it took the same amount of time with millions of records as with thousands of records comes down to how we traverse the graph. So bear with me on the syntax here; I'm going to talk about it later. The syntax says start with n: you're assigning to n the result of finding a node in the users index, an index with a key of name and a value of Morpheus. So you find Morpheus. Then you say match n, and whatever is in between you don't care about. You don't want his direct connections; you want his friends of friends, that's the foaf there, and then you return that. What that gives you is his immediate network. It doesn't matter how big this is; his immediate network can be 50 or 100. Oh, I'm sorry, I forgot to mention that for the query, instead of doing six degrees of separation, they were doing just four degrees of separation.
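To make the idea of starting at one node and walking outward concrete, here is a minimal Ruby sketch of a depth-limited traversal. It assumes the graph is a plain Hash of adjacency lists, and the helper name `network_within` is hypothetical; this is not how Neo4j is implemented, and the Cypher from the slides is not reproduced here. What it illustrates is that the result for Morpheus is the same whether or not a large unrelated cluster exists elsewhere in the graph, because the walk only ever touches nodes reachable from the starting point.

```ruby
require "set"

# Collect every node reachable from `start` in at most `max_depth` hops.
# `adjacency` is a Hash of node => array of neighbours (an assumption
# made for this sketch).
def network_within(adjacency, start, max_depth)
  visited  = Set.new([start])
  frontier = [start]

  max_depth.times do
    # Expand one hop, skipping anything already seen.
    frontier = frontier.flat_map { |node| adjacency.fetch(node, []) }
                       .reject { |node| visited.include?(node) }
                       .uniq
    break if frontier.empty?
    visited.merge(frontier)
  end

  visited.delete(start).to_a
end

small = {
  "Morpheus" => ["Neo", "Trinity"],
  "Neo"      => ["Tank"],
  "Trinity"  => ["Tank"],
  "Tank"     => []
}

# Same graph plus a large cluster that Morpheus is not connected to.
big = small.dup
1_000.times { |i| big["stranger#{i}"] = ["stranger#{i + 1}"] }

p network_within(small, "Morpheus", 2).sort
p network_within(big, "Morpheus", 2).sort
```

Both calls print the same three names; the thousand extra nodes are never visited.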
So it's a finite number. You're not going to go more than four deep when finding this stuff. Let's say Morpheus's network is just a hundred, because they're all directly connected. It doesn't matter how big the graph is; it can be millions of nodes or just 200 nodes. Since you're starting at a node and walking, or traversing, the graph from there, it doesn't care how big the rest of the graph is. That's why it only took two milliseconds.

What you saw there is Cypher, which is the query language. It's pattern matching; it has a declarative grammar with clauses, kind of like SQL. You use it to mutate: you can create, delete, and update. You can do aggregations, ordering, all that stuff, and I'm going to show you some examples in a minute.

Let's start with a real example with a real graph. Everybody's familiar with the Matrix movies, right? Believe it or not, there's somebody at Hashrocket, I'm not going to name names, who has never seen the movies. I was like, what? Really? Anyway, Neo needs to save the world, and he needs to find the Keymaker. So you start at Neo: Neo knows the location of Morpheus, who knows the location of the Oracle, who knows the location of the Merovingian, who knows the location of the Keymaker. This is a pretty straightforward graph, and pretty boring actually, so let me throw some more details in here, because again, we can add properties to both. It's cool stuff. So now we know that Neo is a human who knows the location of Morpheus; the location is the Nebuchadnezzar, and the disclosure is public. Morpheus is a human who knows the location of the Oracle, who's a program; the location is the Matrix, and the disclosure again is public. The Oracle knows the location of the Merovingian, which is Club Hel. That's what it was called.
I'm not even kidding you, Club Hel. And the Merovingian knows the location of the Keymaker, whom he's holding captive in a Windows server. I may or may not have made that up. Anyway, the disclosure is secret, because he's holding him captive. And so Neo's mission, should he choose to... no, that's a different movie.

So here's how you would create this stuff. You create a node in Cypher like this. Very simple; it's going to look very similar to SQL. You say create neo, and the only reason we're assigning it to neo is so we can return it. You could just say create and then that hash, and it would just create it and return the ID. That's how easy it is to create a node in the console.

To create a relationship, again, you find the nodes you're interested in, the start and the end node, and then you say create: neo, "knows location of", you pass in the properties you want, and then morpheus. Then you return r, and it will look something like this. It gives you "knows location of", which is the type of the relationship, the ID, and then its properties. Pretty straightforward.

Now, to query what we just created: we want to know Neo's immediate network, his first-degree network. You have to start with a node, remember, and then you say, I want to match everybody who is in Neo's immediate network, and it will return Morpheus. That's it. Now, this doesn't mean it has to be just one linear thing; for simplicity I left it like this. But Neo knows the location of Trinity, because they hooked up and whatnot, and Neo knows the location of Tank and all the other guys who were cool in the first movie, and then whatever happened in the other two.
I can't remember, but you know, there's more. So this result would probably be larger; you'd have more nodes here in Neo's first-degree, immediate network. Cool.

So, to get the path to the Keymaker: if you want to go in a straight line all the way to the Keymaker, you would say something like, start with Neo, and then you do something like this cool-looking syntax here: "knows location of", star, one dot dot four. You want to go four deep, because if you remember it's one, two, three, four all the way down. And you can return just parts of it, like a property from the relationship. It would look something like this. So this becomes Neo's to-do list of who he needs to go find: Morpheus, Oracle, Merovingian, Keymaker, save the world. Check. We're done.

What you've been seeing here is just a screenshot of the Neo4j web admin console. You can write all this stuff and create it all by hand if you want to. There's also a tab next to it, which I didn't have a slide for, but you can go check it out; it's on your localhost. You can install this very easily with brew, or you can install it with some gems. You can visualize the data, which is pretty cool, but I'm not going to go into that.

I am going to go into how you would implement this using Rails. If you're using JRuby, then there's neo4j.rb, a gem that basically replaces Active Record. But at that point your graph database becomes your only data store. Now, neo4j.rb implements Active Model and parts of Active Record, so it will be very similar, like has_n and all that kind of stuff.
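The walk from Neo to the Keymaker described above, at most four "knows location of" hops, can be sketched in plain Ruby as a breadth-first search over directed pairs. This is only an illustration of the traversal; the constant `KNOWS_LOCATION_OF` and the helper `path_to` are hypothetical names for this sketch, and it does not use Cypher, neo4j.rb, or any Neo4j API.

```ruby
# Directed "knows location of" pairs from the talk's example graph.
KNOWS_LOCATION_OF = [
  %w[Neo Morpheus],
  %w[Neo Trinity],
  %w[Morpheus Oracle],
  %w[Oracle Merovingian],
  %w[Merovingian Keymaker]
].freeze

# Breadth-first search for a path of at most `max_hops` relationships.
# Returns the names along the path, or nil if no such path exists.
def path_to(relationships, from, to, max_hops)
  paths = [[from]]
  max_hops.times do
    # Extend every partial path by one outgoing relationship.
    paths = paths.flat_map do |path|
      relationships
        .select { |start, finish| start == path.last && !path.include?(finish) }
        .map { |_start, finish| path + [finish] }
    end
    found = paths.find { |path| path.last == to }
    return found if found
  end
  nil
end

path = path_to(KNOWS_LOCATION_OF, "Neo", "Keymaker", 4)
puts path.drop(1).join(" -> ")
# prints "Morpheus -> Oracle -> Merovingian -> Keymaker"
```

With a limit of three hops the same call returns `nil`, since the Keymaker is exactly four relationships away from Neo.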
You can do all that stuff, and that's cool. But what if I told you, you can have both? In most cases, realistically, you're going to have a polyglot database environment. You're not going to say, hey everybody, we're only going to use a graph database for everything now, forget about anything else, because realistically that's not practical. You want to leverage a graph database for what it's good for. In my case, we have a client, in closed beta right now, that needed to track degrees of separation. Postgres was our system of record, and we had the things that mattered to us synchronizing with a Neo4j database. That way we were able to use the graph database for measuring the degrees of separation, and the bulk of the stuff was still happening in Postgres.

So if you're not using JRuby, and it's been my experience that I haven't been able to use JRuby on a project yet, then the REST API is what you want to use. There's a handful of Ruby gems that wrap the REST API. The first one we started with was neography, which is a very thin wrapper. There's another one called architect4r, and then of course my favorite one is called keymaker, mostly because I wrote it. It's still a work in progress, pretty much in its infancy; we extracted it from the project I was telling you about. What keymaker aims to be is a multi-layer Ruby wrapper for Neo4j. Hamburger. The first layer interacts with the Neo4j REST API through raw requests: it basically tries to implement, at the lowest level, every request you can make to the Neo4j REST API. The second layer binds those raw requests into Ruby objects. And then finally the top layer implements
parts of Active Model, and treats those nodes and relationships as bona fide Ruby objects. As you go higher in the stack it becomes a little more rigid, because at the very lowest level a graph database is schema-less, so you can do whatever you want; the properties are whatever properties you want. As you go higher up the stack it becomes a little more structured, and that's okay, because you want some structure. Hamburger. Sorry, that was just a random thought.

So what would it look like? This is, again, more code. Am I boring you guys with code? Cool. All right, here we go. For the program, because we have programs and we have humans, you would say include Keymaker::Node, and then you give it a property. So now this object can only have a name. Even though you could technically create more properties, you're making it a little more rigid, more structured, by giving it a name. Then you say, I want to create an index called programs, and I want that to be on name, or on name and on email, or whatever else. And now, this is a very
It's like a find by sequel type thing that it's a real weird way to creating a Relationship and I'm going to show you a different way to do it later But I wanted to show you that you could actually execute the cipher So you would write the cipher like I was showing you earlier and then You can you know, you see the name over there That's how you pass stuff into it and then you you pass a it's very similar to to these to the The hash syntax that you can pass in when you're doing a word class in Active record so then you execute the cipher and then it you know it creates the the Relationship an easier way to do this is to dig The cool thing about the key maker is that it allows you to use all the parts and at every point in that in that in the Three layers you can go and dig down and use this lower level call To the rest API right create relationship Which you say you pass in the type you pass in a start node you pass in an end node and then whatever properties and you're done It's a lot cleaner than this Right and I mean I guess eventually the goal is to be a little bit more To kind of abstract all this stuff out and be a little bit more active record like for key maker But for now it works and it's pretty cool. There's the same thing basically for humans, right? So So program and then humans the same thing and then if you want to get for the humans You're gonna get the first-degree network For that instance you would just for this you have to kind of do this You can't really hide this so you need to get this is how you get the first-degree network and It uses the instances name So to find to where you're starting and then it says give me all the network stuff And then that gives you the network the first-degree network now the fourth degree network It's different than getting all all of them at the same time you want to get only the fourth network You would do something like this, right? 
Start with me, who knows the location of the first, who knows the location of the second, who knows the location of the third, and finally who knows the location of the fourth. Then you execute that Cypher, pass in the name and all that stuff, and you're good to go. So you would say, I'm going to create Neo, I'm going to create all these other guys, or find them; create them and find them. Now you have all these variables, and you can say, I'm going to create all the relationships. Basically we're creating the graph I was showing you. And then finally you just say, give me Neo's first-degree network, and then give me the fourth-degree network, which eventually gives you the Keymaker. And you're done. That's it. You're good to go.

So, questions? Yes. Yeah. So the question was what kind of issues we had synchronizing Postgres with the graph database, which was Neo4j. What we do is keep a Neo4j ID in Active Record, so in Postgres, and we have the opposite in the graph database: we have an Active Record ID. As a matter of fact, keymaker does that for you and keeps that stuff in sync. As for how to keep them in sync, we use observers. I probably wouldn't do that again; I'd use some other way of doing it. But yeah, it's part of the challenge of having a polyglot database environment. It's worth doing, though, because it really simplifies things. I mean, think about creating that crazy UNION ALL SQL. That would be insane.

Any more questions? Did I just speed through it? Yeah, what's up? When would I not want to use a graph database? Since it's so general purpose that you could use it for everything, and it's pretty performant, I can't really think of anything. I mean, I'm not going to tell you to go and only use a graph database, because that's just not realistic, right?
But I just can't think of anything that you couldn't do with a graph database. It's pretty cool once you start thinking that way, because again, it maps really closely to how we think.

What's up? Oh, yeah. Right. So the question is, how do you maintain referential integrity? Very carefully. I mean, that's one of the pain points: you have to make sure that your Rails app... well, I guess that could be the responsibility of keymaker, or whatever library you're using, to keep those two things synchronized. I will keep that in mind. And by the way, keymaker is... that's where it is, if you want to fork it, take a look at it, and help me out. That would be great. What I've been focusing on with Travis, we've been working on it together, is getting all the low-level requests done, so the first layer is tested and all that, and then moving up. Right now all three layers are there, but layers two and three are not all the way there.

Yes? Yeah, it does, and that's with Cypher. Now, it's worth mentioning that they do have a different one. Cypher is very expressive, as you saw; it's very verbose, and I don't mind that. I actually like that. They had two plugins: they had Cypher and they had Gremlin. Gremlin is very, very concise, like g is the graph, dot, capital V, parens is the entire graph. Maybe I'm getting that wrong, but anyway, it's very concise, so concise that it's kind of cryptic at times, but it still works and you can still use it. And you can do all the algorithms and stuff like that, like breadth-first or depth-first, cool stuff. Any more questions?
Yes? Oh, well, the way we did it is with observers, so any time there was a save or... Sorry, what was the question? Can you repeat it? Yes, there are transactions in graph databases. So the question is, are there transactions with graph databases, and how do you synchronize them with Postgres? The way we did it is we used observers: every time something was updated in Postgres, it was synchronized with the graph. We could recreate the graph database from everything that was in Postgres, and that's why we said Postgres was the system of record, right? Cool.

We have time for one more question. Who's going to be the lucky winner? Nobody? Oh, yes. Right now it has like 12,000 or 15,000 nodes. The numbers start to multiply with the properties, because each node can have properties, and then there are the relationship properties. So yeah, right now it's about that, because we're still testing it locally. The Neo4j guys could probably give you better numbers. They're really good; they're an awesome community, very helpful guys. Neo Technology is the parent company, kind of like Basho, correct me if I'm wrong, is the Riak company. So Neo Technology is the parent company of Neo4j, but it is open source. Cool.

If you want to know more about databases, read this book. It's pretty cool: Seven Databases in Seven Weeks. I think somebody else mentioned Seven Languages in Seven Weeks. This one is pretty cool, by Eric Redmond and Jim Wilson. So, thank you.