Welcome to Advanced Operations and Features with Ryan Lambert. This final session scratches the surface of a range of the advanced features that PostGIS is capable of. Today's session will cover preparing data in formats such as GeoJSON and MVT for external applications, 3D rendering, and routing. My name is Lindsay Hooper and I'm one of the Postgres Conference organizers; I will be your moderator for today. I'm here with Ryan Lambert, who is the owner of RustProof Labs. Ryan has been working with GIS since 2011, and he got his start with PostGIS when, on a quest to update a road map, he began using PostGIS, Postgres, and OpenStreetMap. He has been a contributor to the OpenStreetMap project for the last five years. Ryan has given many talks on PostGIS, and he's currently working on a book on how to use PostGIS and OpenStreetMap together. We're expecting the session to run just around an hour, and it will be heavy on the demo. So with that, I'm going to hand it off to Ryan. Enjoy.

Hi everyone. Thank you, Lindsay, for the introduction as always, and thank you to Postgres Conference again for hosting this entire webinar series. And of course, thank you to everyone who has joined us live for today's talk. It's going to be a lot of fun. Today we are on the final session of a six-part series. That's exciting to me because this is a really fun session for me, and it also represents closing out a very big project; it's always nice to check something off the list. The whole series so far has been published: the video recordings are online, and the link at the top of this page has the intro video as well as links to all of the other videos in this series. After today's session is done, I'll get the video cleaned up and published in the same format as well.
So today's agenda: we're going to spend a little bit of time talking about extensions beyond PostGIS that integrate with its functionality and take it a step further. I'm going to spend a decent amount of time going over some functions and features that I personally find cool. And then we'll spend the majority of the time working on a demo using OpenStreetMap roads data and pgRouting to get us from point A to point B.

On the note of the demo, I have made the demo data available, and I'm going to paste the link into the chat so it's easy for everyone to get to that page as we go. There is a download, just under three megabytes, for a small extract of OpenStreetMap data, and there's also a link to download the SQL script that I'll be walking through in the demo. You can download the data and the code. A lot of times it's easier to follow along with these complex queries if you can get the code up on your own screen, focus in on the snippets, and zoom and scroll as you see fit. All of that will continue to be available once the video is posted as well.

PostGIS itself is a lot of fun. There's a lot of powerful functionality built into the core PostGIS extension, but that's only part of what we can do. There are a number of other extensions in the PostGIS ecosystem that can be added on to extend the functionality of your spatial databases. One note on the top item there: the PostGIS raster extension is only split out as of PostGIS 3 and newer. If you're using PostGIS 2.5 or prior, the raster components are built into the core PostGIS extension, but as you move to newer versions of the extension you'll have to install those components separately.
Another key piece that extends PostGIS is the pgRouting extension, and we'll be spending a good amount of time today on a demo that puts it to use. When I look at these extensions, one way to gauge how much is packaged with each is to look at the number of functions it includes. This gives a breakdown of some of the more popular add-ons to PostGIS with an idea of how much additional functionality comes along with each. To be fair, counting functions in an extension is like counting lines of code in any programming language: it's only useful to a certain extent, so take it with a grain of salt. One function can do a lot or a little, so the count itself is a fairly shallow indicator of functionality, but it's a good way to get an idea of what's involved in each of these additional components of PostGIS.

And now on to some functions and features that I personally find cool. Branding something as "advanced" is always a bit of a risk, because it's hard to know what counts as advanced for everyone, so some of what I'm focusing on here is simply neat things you may not have otherwise encountered. In any database you need to be able to ensure some level of quality, and PostGIS gives us some quality control functions. We have the ability to check our geometries to ensure that they are valid. The first function on the list, ST_IsValid, returns a Boolean true or false for each geometry, so if you want to get down to just the invalid geometries you can set your filter that way. The second function, ST_IsValidReason, will actually tell you what's wrong with the geometry from the PostGIS perspective, so you can look at your source data and hopefully remedy the situation and get to valid geometries in your database.
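Beyond validity, two related quality checks, ring closure and winding order, are simple to sketch in pure Python. These are toy stand-ins for what functions like ST_IsClosed and ST_IsPolygonCCW check, not the PostGIS implementations:

```python
def is_closed(ring):
    """ST_IsClosed-style check: a ring must start and end at the same point."""
    return ring[0] == ring[-1]

def signed_area(ring):
    """Shoelace formula; positive for counterclockwise winding."""
    area = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        area += x1 * y2 - x2 * y1
    return area / 2.0

def is_ccw(ring):
    """ST_IsPolygonCCW-style check based on the sign of the signed area."""
    return signed_area(ring) > 0

# A unit square traced counterclockwise, closed back to its start.
square_ccw = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
```

Reversing the ring flips the sign of the area, which is the same idea PostGIS uses to distinguish the two winding conventions.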
Some more geometry quality control functions are available. You can check to see if your data is closed, and you can check your polygons for which winding they use, clockwise or counterclockwise. If you are using OpenStreetMap data, you may notice that all of your polygons are counterclockwise. There are two standards: half of the world uses counterclockwise winding for polygons, and the other half uses clockwise. So depending on where your data comes from and where it needs to go, you may need to pay attention to the directionality of your polygons, and these functions let you look at that. There are a number of other functions I would consider quality control; if you look for function names that start with ST_Is, you'll get a good list of other functions inside PostGIS that can help you assess the quality of the data and work on bringing it up.

Another, more specific way of enforcing quality control is with addresses. If you have street addresses, you often need to get them into a more standardized format, and luckily we have some extensions that allow us to do this. The first one here is the address_standardizer extension. It does require that you also install some data; the US data portion is included by default. If you are outside the US, I'm not certain exactly how you go about getting those rules and that data set. I assume there's more documentation around that; I personally haven't had the need to standardize international addresses, so I haven't gone down that rabbit hole yet. When you run your street address through the standardizer, you get back a tabular result set: it parses the data, standardizes it based on its rules, and spits out the result on the other side.
In this case we get this stdaddr result, and the docs have more information about what it all means. One of the key things I notice is that "Street" is spelled out in full instead of being abbreviated as "ST". "Colorado", our state name, is also spelled out in full instead of using the postal abbreviation. So it parses out the data and puts it into a nicely standardized form for us.

The address standardizer I have more experience with is packaged with the Tiger geocoder extension: the pagc_normalize_address function. When I'm standardizing addresses, it's almost always because I want to geocode them, and because I always use the Tiger geocoder, I've always used this function that's packaged with it, since it's intended to work with those geocoding functions. One advantage I see is that it's a little easier: you don't have to install that second extension, and the function call takes only the address, without the extra parameters. The flip side is that it's less customizable; you're stuck with the standardization rules built into the function. The result from here is very similar to the previous standardized address; the main differences are the column names. For the street name, the prior one used just "name". We also see that "Street" is now abbreviated as "ST" and our state has been abbreviated as "CO". So if you have the need to standardize your addresses, take a look at the various functions that are available and figure out which one is most appropriate for the needs you have at hand. I personally have used this one because I'm using the Tiger geocoder.

Now, the Tiger geocoder is a really cool extension. It handles a lot of the underlying steps of getting the data in in the first place.
There are some helper processes that handle downloading the TIGER/Line data sets from the Census Bureau website and loading them into the database in the format the geocoder needs. The link here is to Michele Tobias's blog; she has one of the best write-ups I've found on the process. The procedure I use links to hers as the source; mine just has a couple of changes for Linux-specific setups, where her blog post covered a Windows-specific setup. There are just a couple of things you have to adjust between Windows and Linux, but this has been a really good resource for standing up a Tiger geocoder from scratch.

Now, one of the really fun projects I've had in the past few months: I needed to create a grid around a specific point on a map, and I wanted my grid to go out in all directions for a certain distance. I found this combination of functions really fun to pull together, and ST_Project is the key element. One thing to notice here is that the first argument of ST_Project is a geography, not a geometry. Throughout this series we've dealt exclusively with geometry data, but this function requires geography. What ST_Project does is take your starting geography and calculate a new point that is a certain distance away (the second parameter) in a certain direction, the azimuth (the third parameter). So we can say: I'm here now, but I want to go 250 meters to the east. That's how ST_Project works, and I'll talk through the other two functions as we go through the example code. The end result we're going to get to is a picture like this: in the very center of the grid there's a red dot, and that's our starting point. What I've done here is make that the centroid of the Coors Field polygon.
Then, out from that center point, I wanted my grid to extend in the four cardinal directions for a certain distance. This is the goal for the code I had to write, and what I ended up with was this CTE (common table expression) query. It could be rewritten to use subqueries, temp tables, or full tables; there are a number of ways to restructure it, but I personally like CTEs for keeping my logic together.

The first part of the CTE defines the start point. This first little query filters the buildings down to just Coors Field, and it grabs the centroid, the center point, of that polygon. That's how we get a single point out of much more complex polygon data. And because ST_Project needs a geography, I first transform my geometry to SRID 4326, and once I've done that I can cast the whole thing as a geography to get it into the format the next two steps need. So we have our start point: that's the red dot we saw in the middle of Coors Field.

Then we pass this into the west_grid portion; I really should have named it east_west_grid, because it goes in both directions. In this step we start with our start point and join it to the result of generate_series. What generate_series does is simply create a series from a start value, in this case negative 10, to positive 10, with an increment of one. So I get a total of 21 rows, each with a single number from negative 10 to positive 10, with zero in the middle; this has been aliased as s. With ST_Project we pass in the start point, and then we say how far we want to go: I'm passing in my range of numbers and multiplying by 250, which sets up my 250 meter grid. And then for the direction, we can use pi over two.
From a geography standpoint, pi over two as the azimuth means east. So if the series value were one, we'd be saying: start at the point and go 250 meters to the east. Because our series runs from negative 10 to positive 10, we also get negative distances, which extend the grid to the west as well as the east of our center point.

With our west_grid holding that horizontal east-west line of 21 points, we pass that data into the last step. West_grid is now our base table, and we again join it to a generate_series result. So we're taking those 21 points that run east to west and applying another range of negative 10 to 10 to each. When we use ST_Project this time, we're still doing a 250 meter spacing applied to our negative and positive numbers, but the azimuth is set to zero, which translates to north; because of the negative values we also get south, so we get 10 steps to the north and 10 steps to the south of each point. After this projection has happened, I cast the data back into a geometry; when you cast from a geography, it defaults to SRID 4326. Then in the outer step I transform it back to 3857, which matches the default projection for all the other OpenStreetMap data in this database.

The logic going on here takes a while to absorb. I sit down with pencil and paper and really work out what things like this mean when I'm trying to digest what's going on in a query like this. But once you get it figured out, it's really only a 15-line query or so, and it does a pretty cool thing.
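The grid logic just described can be sketched in pure Python. This toy works in planar meters rather than on a geography (the real ST_Project computes destinations on a spheroid), but the structure is the same: a destination-point function applied across two generate_series-style ranges.

```python
import math

def project(pt, distance, azimuth):
    """Toy planar stand-in for ST_Project: move `distance` units from
    `pt` along `azimuth` (radians, 0 = north, pi/2 = east)."""
    x, y = pt
    return (x + distance * math.sin(azimuth), y + distance * math.cos(azimuth))

start = (0.0, 0.0)  # stand-in for the Coors Field centroid

# First pass: 21 points east-west (like generate_series(-10, 10), azimuth pi/2).
east_west = [project(start, s * 250, math.pi / 2) for s in range(-10, 11)]

# Second pass: from each of those, 21 points north-south (azimuth 0),
# giving the full 21 x 21 grid of 441 points.
grid = [project(p, s * 250, 0) for p in east_west for s in range(-10, 11)]
```

Negative series values become negative distances, which is what pushes the grid west and south of the center point.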
You can make a grid around points, and while there aren't a ton of applications for this type of project, when you do have the need for something like this, a small amount of Google-fu can get you to a functional example fairly quickly. I think that's really powerful.

When you're dealing with routing, which we're going to spend a good amount of time talking about, you work with a lot of line data, and lines have a number of complexities, especially when you have really long lines and you're trying to find a point in the middle. The first two functions here work on a line: they allow you to find a point within a line. The idea is: I'm here, and there's a road over there; where is the closest point on that road to me? That's what the first two functions give us. The third one allows us to actually split the road at a specific point. I'll show a quick example of how these functions work, but if you really want detailed, precise routing for the start and end points, tricks like this are going to be very helpful.

With polygon data we end up with a lot of complex shapes. In this case I've shown a building with a number of cutouts in the middle: open plazas that you can only access by going through the building, so they get cut out. Another place you'll see this is a lake with an island in the middle: you'll have your polygon for the lake, but there will be a cutout in the middle where the island is. There are a number of reasons why you have these complex polygons, and the ST_NumInteriorRings function gives us a way to identify them and work with them from there.
Whenever you have data in a database system, one of the natural questions is: how do I get the data out and into this other system? PostGIS has a bunch of different ways to get your data out into other formats. This list is just a number of the ones I've used most commonly; if you look through the full list of PostGIS functions under ST_As, you'll see the full set of functions that can convert into other external formats.

3D mapping is one of the really cool things I'm interested in. I've just barely started scratching the surface of indoor mapping, for being able to do routing inside larger buildings. One of the things I wanted was the ability to model and visualize that space. I haven't gotten very far on this project yet, but by using the SFCGAL extension that goes with PostGIS, along with the X3DOM project, I've been able to pull data out of PostGIS (the ST_Extrude function can take a 2D object and pull it out into 3D), and the X3DOM project then allows you to actually visualize all this in the browser. I'm playing around with different ways to go about that process. It's a lot of fun, and I'm really looking forward to seeing continued development in projects based around some of this technology.

And one of the things I love about PostGIS is how well it works with QGIS. I absolutely love the QGIS desktop software for GIS applications; it is fantastic. One of the reasons I love it so much is that I have the ability to save my styles directly in the database, right there with the data itself. The layer_styles table will hold the style information from QGIS. It's just a big blob of XML data, so it's pretty simple to store, but the fact that it's enabled and easy to do by default means you can back up your styles right there with your data. You don't have to rely on analysts to save their styles in the right place.
I don't have to remember to back my own stuff up, because I back up my databases. So this is a really cool thing, and I'll show a little bit of how it works at the end of the demo.

On the note of the demo, it's time to switch over to DBeaver and start getting our hands dirty with some code. This is the script that's available to download from the website, so feel free to follow along. If you have questions about the code as I'm going through, please make a note of the line number I'm on (the line numbers are displayed here on the left-hand side); that will greatly help me come back and answer the question if I've moved on from there.

The first thing I'm going to do is double-check that I've installed pgRouting. If you've loaded the demo data, you already have the PostGIS and hstore extensions installed; they're required to get the data loaded in the first place. pgRouting will be needed here in a little bit. And I just want to point out again: I love the IF NOT EXISTS syntax that Postgres has. Not all databases have this succinct syntax that lets you try to create something and, if it already exists, not worry about it. I can run this over and over and I'll never get an error. I think that's really cool.

For this first code example, I'm going to run the code so we can see the initial results. What I'm doing to start with is setting up the beginning of our routing problem. I'm pulling out the start point that I've defined from natural_point: we're pulling out just the peak that is South Table Mountain. This is where I want to start my routing problem from. And I am joining to the roads data.
I'm casting a net that goes out 50 meters: the ST_DWithin function allows us to select roads that are within 50 meters of the point I've selected, to get us those nearby roads. Down below we see two rows were returned; one is classified as a track, one as steps. The ST_LineLocatePoint function is the next column. This function takes a line geometry (that's r.way, our road) and a point geometry, and it tells us where on the line is closest to the point I've passed in. The result we get is a number in the range of zero to one. If it's exactly zero or one, we know we're at the end of the line. If it's a number like 0.387, we know the nearest point on this line is about 39% of the way along the length of that line. That by itself isn't super helpful, but we can pass that ST_LineLocatePoint result into ST_LineInterpolatePoint, which takes that number and the original geometry for the road and returns the actual point on the line at that spot.

I'm going to flip to the spatial tab here in the DBeaver output so we can see what this looks like on the map. Zooming in, we have our center point here: this is the South Table Mountain peak that I pulled from the natural_point table. Then we have this orange line; it looks like one orange line, but it's really the two different lines together. And the two blue dots indicate the results from ST_LineInterpolatePoint, so we can see where it found those nearest points. This is very helpful for splitting apart your lines. As we'll see as I go through this routing demo, my demo doesn't use this method; it uses a very simple nearest-neighbor search, and we'll see that it's not perfect.
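Under some simplifying assumptions (a single straight segment in planar coordinates, rather than a full linestring on a map projection), the locate-then-interpolate pattern looks like this in pure Python; these are toy versions of what ST_LineLocatePoint and ST_LineInterpolatePoint do:

```python
def line_locate_point(line, pt):
    """Toy ST_LineLocatePoint for one 2D segment: return the fraction
    (0..1) along `line` that is closest to `pt`."""
    (x1, y1), (x2, y2) = line
    dx, dy = x2 - x1, y2 - y1
    seg_len_sq = dx * dx + dy * dy
    t = ((pt[0] - x1) * dx + (pt[1] - y1) * dy) / seg_len_sq
    return max(0.0, min(1.0, t))  # clamp to the segment's ends

def line_interpolate_point(line, frac):
    """Toy ST_LineInterpolatePoint: the point `frac` of the way along."""
    (x1, y1), (x2, y2) = line
    return (x1 + frac * (x2 - x1), y1 + frac * (y2 - y1))

road = [(0, 0), (10, 0)]                      # a straight east-west "road"
frac = line_locate_point(road, (3.87, 5.0))   # about 0.387 of the way along
nearest = line_interpolate_point(road, frac)  # the point on the road itself
```

Feeding the fraction from the first function into the second is exactly the round trip described above: first "how far along is the nearest spot", then "give me that spot as a point".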
If you really do want those polished, perfect results, you're going to have to throw in some tricks like this to split apart your linestrings and find the right start point for your route. So I've decided that I want that South Table Mountain point to be my starting point, and I'm going to save that geometry as the start_point column in a new table. We now have a start_point column in a table called my_route, and I'm going to build on this table as we go. This is much like I would do in a full production run, except that normally, instead of one start point and one end point, I'd have a large number of them. I find it very helpful to materialize this information in a table, especially during the troubleshooting and exploration phases, because when something doesn't work right (your route comes out funky, your times are off, your costs are off), this gives you easy ways to get in and figure out what's happening at each step of the way.

So we have our start point, but a start point isn't very good on its own; we need somewhere to end. We alter the my_route table and add an end_point geometry column, and then this next query updates that end_point column using the centroid function. Again, we're pulling a polygon: I want to get to Pangea Coffee Roasters after I've been on this hike, but I don't want to compare against a polygon, I just want the point, so we use ST_Centroid to give us a nice single point. I'll run this query to take a look at what we have inside the my_route table. Flipping into the row view, we have the start point and end point, and if I throw it over onto the spatial viewer (I'll click here so you can see where I'm looking), on the right-hand side we have the South Table Mountain start point.
And over here on the lower left side we have this orange end point, which is inside Pangea Coffee Roasters. So we now know where we're starting from and where we want to get to.

At this point in any routing problem you need to consider how you're going to set up your cost model. Routing is largely driven by costs, so you need to define what that cost is. A lot of times we want the fastest route somewhere, so speed is a very applicable element of the cost: if I want to get there as fast as possible, I want the fastest route. You may want to use distance instead; that's pretty easy to do, and a lot of simpler examples show distance because you don't have to work too hard to get the distance of a geometry. You may have a preference, such as "I specifically do not like this highway", so you could set the cost for certain highways extremely high to help the router go around them. There's a lot you can do with your routing cost model.

Whatever you do, it's best to set up a helper table, a lookup table, that can store this meta information about your cost model. In this case we have our road subclasses. I use this code column throughout; anything based on the PgOSM project has this code embedded everywhere, so I use it to join to my main data. Then I have the speed in miles per hour (you could easily set it up for kilometers per hour, which would actually make the math later on much easier), and some Boolean indicators for whether or not a road is routable for motorized vehicles or for foot traffic. This allows me to select the roads I want based on multiple attributes, not just the speed and costing information. This routable table will be joined to a number of times later on as we go through this example.
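A toy version of that lookup table and the mode-based filtering might look like the following; the subclasses, speeds, and column names here are illustrative placeholders, not values from the actual demo data:

```python
# Toy stand-in for the "routable" lookup table: per-subclass speed
# plus Boolean flags for which travel modes can use the road.
routable = {
    "motorway":    {"speed_mph": 65, "route_motor": True,  "route_foot": False},
    "residential": {"speed_mph": 25, "route_motor": True,  "route_foot": True},
    "footway":     {"speed_mph": 3,  "route_motor": False, "route_foot": True},
}

def foot_routable(subclass):
    """Mimics the join + filter: keep roads whose subclass allows foot traffic."""
    return routable.get(subclass, {}).get("route_foot", False)

roads = ["motorway", "footway", "residential", "unknown"]
walkable = [r for r in roads if foot_routable(r)]
```

The point of the design is the same as in the talk: the lookup row carries both the costing input (speed) and the mode flags, so one join gives you everything you need to filter and to cost.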
Roads need prep for routing. That is a true statement, always: you almost always have to put effort into cleaning and preparing your roads data before it's ready to be used in a routing project. I typically create a new table with a subset of my roads, plus specific attributes, cost information, and other details I may want later in the routing process. So I have this big long query that creates this PgOSM routing roads table. The columns it's selecting grab some general attribute data, and then I'm calculating the cost as a function of length, as well as renaming the geometry column from "way" to "the_geom". Some of the pgRouting tooling seems to expect the name the_geom, so I just standardize that; when I go into a routing phase, I re-alias my names to match where I'm working at that moment.

One of the first steps I take in these restructuring steps is to whittle my data down to the region I'm actually working with. Now, the example data set I'm using today was already limited to three kilometers around Golden, so limiting it here again doesn't add a whole lot of value for this particular demo data. But I almost never have that small a data set to start with; I normally have at least all of Colorado loaded, if not the entire US West region. If I'm doing a routing project locally in Colorado and I have all of US West loaded, there's a whole lot more work the database engine has to do to get me answers. So this can boost your performance quite extensively when you're working with larger data sets. With this subquery that gets me a nice compact envelope, I can limit down my roads lines with a simple bounding box filter using the && operator.
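The && operator compares bounding boxes, not exact shapes, which is what makes it cheap. A minimal pure-Python sketch of that test (planar coordinates, illustrative geometries):

```python
def bbox(geom):
    """Axis-aligned bounding box (xmin, ymin, xmax, ymax) of a point list."""
    xs = [x for x, y in geom]
    ys = [y for x, y in geom]
    return (min(xs), min(ys), max(xs), max(ys))

def bbox_overlaps(a, b):
    """Toy version of the && operator: do two bounding boxes intersect?"""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

region = (0, 0, 100, 100)                  # the study-area envelope
road_in = bbox([(10, 10), (50, 20)])       # a road inside the region
road_out = bbox([(200, 200), (250, 260)])  # a road far outside it
```

Because the check is just four comparisons (and in Postgres it is index-assisted), it can discard most of a large data set before any expensive geometry work happens.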
Then I'm joining to the routable table; as I mentioned, we join on the code column. But here, because I'm starting at the top of a mountain and walking somewhere local, I'm going to exclude anything that's a motorized path. This will keep me on the sidewalks and the hiking trails and avoid the local city streets that are really intended for driving, not for foot traffic. I can go ahead and run this query, and when it creates the table, we see that we have 2,278 rows in the result. By limiting to the Golden bounding box, we dropped some rows of road data, and by excluding the motorized paths, we further whittled down the number of roads we're working with. Routing problems are inherently complex, so I'll keep saying it: the more you can whittle down your data to what you actually need for the problem at hand, the better your performance will be on any given hardware. And of course, don't forget the spatial index; a GiST index should be created on your geometry columns as you go around creating these new tables.

Next I'm going to make sure a couple of tables don't exist. The pgRouting functions coming up will create these tables, and if they already exist, the functions will fail, so I want them out of the way first. Now that we've set up our base roads data, we have a little more cleanup to do: the base roads data needs to be what we call "noded". There are long road lines that have other roadways intersecting in the middle of the line, and if an intersection is in the middle of a line, the routing algorithm will not pick up that it can actually deviate there and switch over to another line.
What the pgr_nodeNetwork function does is look at your roads data (you pass in the table you want it to look at), find all the potential intersections in your data, and split them out into separate rows. It takes a couple of seconds to run on this small data set; on a very large data set this can be quite expensive. The result of running this function is a table with "_noded" appended: the same table name, just with the _noded suffix.

Once we have our noded roads layer, we need to create the topology. This is how the routing functions know how to get from this line to that line, from point A to point B: the topology maps out all of the different paths it can take. It creates another new table, using the noded table's name and adding "_vertices_pgr" to the end, so we've now created two new tables in our database based on that roads table I had created.

The last step you always want to do is run pgr_analyzeGraph. This is a quality control step that's worth taking some time to understand. It checks all the vertices and all the line edges it knows about and looks for potential problems. One potential problem you could have is a bunch of one-way roads that meet in the middle with no way out: it can detect that this is an invalid setup, where if you drive in on any of these one-way roads, you reach this point and then you can't leave. It can highlight those, it can highlight dead ends, and some other problems. So if you want to find potential issues, you can query that vertices_pgr table where the chk column is greater than zero and inspect your data from there. For the sake of today's demo, I'm skipping ahead.
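To make the noding idea concrete, here is a toy two-segment version in Python: if two single-segment roads cross mid-line, each is split at the crossing so a router could change lines there. The real pgr_nodeNetwork handles whole tables of multi-vertex linestrings; this sketch shows only the core geometric step:

```python
def intersect(p1, p2, p3, p4):
    """Intersection point of segments p1-p2 and p3-p4, or None when they
    are parallel or the crossing falls outside either segment."""
    x1, y1 = p1; x2, y2 = p2; x3, y3 = p3; x4, y4 = p4
    denom = (x2 - x1) * (y4 - y3) - (y2 - y1) * (x4 - x3)
    if denom == 0:
        return None
    t = ((x3 - x1) * (y4 - y3) - (y3 - y1) * (x4 - x3)) / denom
    u = ((x3 - x1) * (y2 - y1) - (y3 - y1) * (x2 - x1)) / denom
    if not (0 <= t <= 1 and 0 <= u <= 1):
        return None
    return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))

def node_pair(a, b):
    """Toy noding: if two roads cross, split each at the crossing point
    so the shared vertex becomes a place to switch lines."""
    pt = intersect(a[0], a[1], b[0], b[1])
    if pt is None:
        return [a, b]
    return [(a[0], pt), (pt, a[1]), (b[0], pt), (pt, b[1])]

road_a = ((0, 0), (10, 10))
road_b = ((0, 10), (10, 0))
segments = node_pair(road_a, road_b)  # two roads become four segments
```

Without this split, the two roads would only "exist" at their endpoints, which is exactly the routing problem described above: the crossing in the middle would never be treated as a place to turn.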
Quickly looking at the roads data I generated, just the footways versus the noded data: this shows that we originally had about 2,300 rows, and once we ran it through that noding function we ended up with almost 6,400 rows. That doesn't mean we have more geometry data; the same geometries were just split apart into more rows of data, so be aware that that does happen. One really nice feature about how this all works is that the noded table the function generated gives us a column called old_id. It matches back to whatever ID column it was built off of; in our case, that was the osm_id. So let's walk through a few steps with this noded version. If I select this first geometry here, I'm going to zoom in so you can hopefully see that little tiny blue line right at the intersection of these dotted red lines. That is one portion of what was noded out. If I continue down here and select one line at a time, we can see that we have some much smaller sections, and there are a lot of intersections where these small line segments end, so we can walk through the segments if we want. I can also select the full column and get an idea of what the full original line looked like. If we hadn't gone through the process of noding the data to split it out, this would be treated as just two endpoints: one on this side and one at the northwest side. None of these other paths up here would have any ability to intersect in the middle; they would not be considered valid routes. So you do need to ensure the noding is done, otherwise you won't have these valid routes. Now I'll run the second query, using the osm_id back on the original table that was loaded with the demo data.
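The old_id linkage described above can be queried directly. A minimal sketch, with assumed table names and a hypothetical OSM way ID:

```sql
-- Map noded segments back to their original way via old_id.
-- The way ID below is hypothetical.
SELECT n.id, n.old_id, r.osm_id,
        ST_Length(n.geom) AS segment_length
    FROM routing.roads_noded n
    INNER JOIN routing.roads r ON n.old_id = r.osm_id
    WHERE r.osm_id = 123456789
    ORDER BY n.id;
```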
And if I flip over into row view, we can see we have a single row linked to this ID, and if I select that geometry, we can see that full line in its original glory before the noding was done; it really was truly one line. Okay, with our noded table created, I'm going to add a few more columns to it that I can use later on: one to store the cost as a function of length, another to store the cost as a function of minutes, and another to store the code that PgOSM uses, to make our joins work easily. So we've added three columns to the noded table. This next query simply updates those three columns. We're updating the noded table, and because we have the old_id alongside the new id, we can use it to join back to the routing roads table I created. That lets me get the code I had stored in that table, which in turn lets me join to the routable table. The length we're calculating is a simple ST_Length of the geometry. The minutes calculation uses that length again, along with the metadata we have for max speed based on the subclass of the road type, plus some other math that makes sure my miles per hour combined with the length in meters works out to a time in minutes. I'm relatively confident I haven't messed that particular math up, but there are enough little steps in there that if I have made a mistake, I'd love to know so I can fix it. So this goes through, updates, and persists this data in our noded table so we can use it in the routing functions. Now, if you've done that analyze step and validated your data, you may have gone and done some cleanup. What I'm going to do here is cleanup in the form of deleting anything that doesn't have a source value.
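A sketch of the cost columns and the update, again with assumed names. The mph-to-minutes conversion here is: meters / 1609.344 = miles; miles / mph = hours; hours * 60 = minutes. The `maxspeed_mph` column is an assumption about where the speed metadata lives:

```sql
-- Add cost columns to the noded table.
ALTER TABLE routing.roads_noded
    ADD COLUMN cost_length NUMERIC,
    ADD COLUMN cost_minutes NUMERIC,
    ADD COLUMN code INT;

-- Join back through old_id to get the code, then through the
-- routable metadata to get the max speed for each subclass.
UPDATE routing.roads_noded n
    SET code = r.code,
        cost_length = ST_Length(n.geom),
        cost_minutes = (ST_Length(n.geom) / 1609.344)
                           / t.maxspeed_mph * 60.0
    FROM routing.roads r
    INNER JOIN pgosm.routable t ON r.code = t.code
    WHERE n.old_id = r.osm_id;
```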
This essentially means these roads are unroutable, and if you don't remove them here, later on the routing function itself will have a hard time with the null values, so I just go ahead and remove them. I've looked at this data, and for the case at hand the data we removed is not pertinent to the project. At this point our road data is routing-ready: we're able to pass it into any variety of routing functions through pgRouting. We have a little more to do before we can run the route, though: we have to figure out where we're actually going to start and end. So far, we know where I want to be: standing right on the peak of South Table Mountain, headed to inside the coffee shop. We have that established, but we don't have roads going exactly to either of those places. So what we're doing now is finding potential start points. In this query, we take our start point and find the 15 nearest vertices, ordered by distance from that start point, so these are the 15 closest candidates for our routing function to choose from. If I zoom in just a little, you can hopefully see that most of these are on the dotted red lines, which are sidewalks. One of these dots is going to be the one picked by our simple nearest-neighbor method. These particular nodes come from the _vertices_pgr table. Each one has a primary key associated with it, and I've brought it into the query as our ID. If I click on the data for any one of these points, we can see the primary key value for that row. This is the value we need for the routing function: we need to know the ID of the vertex we're going to start at.
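The candidate search can be sketched with PostGIS's KNN distance operator (`<->`), which uses the spatial index for the ordering. The SRID and coordinates below are hypothetical stand-ins for the actual start point:

```sql
-- Find the 15 vertices nearest a start point.
SELECT v.id,
        v.the_geom <-> p.geom AS distance
    FROM routing.roads_noded_vertices_pgr v,
        (SELECT ST_SetSRID(
                ST_MakePoint(-105.2236, 39.7586), 4326) AS geom) p
    ORDER BY v.the_geom <-> p.geom
    LIMIT 15;
```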
And because our routing needs to know the IDs for our start and end points, I'll go ahead and alter my route table one more time and add start node and end node columns, so we can persist that data with the table. This makes troubleshooting the start and end points of your routing algorithm much easier: if you actually save what you chose, it's a whole lot easier to come back and check what it did, and therefore why it did or didn't work the way you might have expected. So we have our start point and our end point, and then we have two null values because we haven't set those nodes yet. The next two queries handle setting those values. I've basically taken the prior query that showed the 15 nearest vertices and limited it down to just the single closest point. In this case I'm starting with the start point, doing an ST_DWithin on the start point against the geometry. It's important to put some sort of limit on how far out you throw your net when choosing your start and end points. Suppose your start point doesn't have any valid routes within 10 miles: without a limit like this, the algorithm will pick up a start point that's literally 10 miles away and route around it. That result would be invalid, because it misses everything between where you are and where it chose the start point. So I recommend having some way of limiting to nearby areas: if there isn't anything within 200 meters, this point is considered unroutable. This runs and updates the start node ID. The next query is the same; I've just switched to the end point, and I'm setting the end node ID. Go ahead and run that. In this next query, we're mostly looking at the data stored in my route table. I've joined to the vertices table twice.
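Setting the start node with a distance cap might look like the sketch below: the ST_DWithin filter means that if no vertex exists within 200 meters, the subquery returns null and the point stays unroutable. Table and column names are assumptions:

```sql
-- Choose the nearest vertex within 200 meters as the start node.
UPDATE routing.my_route r
    SET start_node_id = (
        SELECT v.id
            FROM routing.roads_noded_vertices_pgr v
            WHERE ST_DWithin(v.the_geom, r.start_geom, 200)
            ORDER BY v.the_geom <-> r.start_geom
            LIMIT 1
    );
```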
Once so I can join for the start node, and again so I can join for the end node, which lets me actually see what we're working with. On the left here I'm selecting the start point, which is South Table Mountain where I'm standing, and I've also selected the point that has been chosen by our nearest-neighbor search for where the route will actually start: we can see we got a point over here. I can do the same with the end point; we get the center of that polygon, and we can see we got one of the blue dots on the sidewalk next to the building. These are the points that have been chosen. I can select all four of these at once so we can see the whole layout: start side on the right, end side on the left. Now we get to the actual routing portion. I'm going to go ahead and run this query so we can see the result; let's get it in grid view first. So I've run the query. The top portion of this query, the select clause, is quite uninteresting; it's just pulling out some data for us to look at. The real magic happens inside this function, which I call the routing DJ, since I'm not quite certain how to pronounce Dijkstra, so I've just always called it my routing DJ. pgr_dijkstra needs a query, in the form of a string, that gets it the data it needs in order to run. That query is where you get to tell it what costing you're using. Right now I have it set up with cost minutes. If I wanted to find the shortest route by length instead, I could quickly switch this to cost length and get a completely different result based on a different cost. The other parameters the function needs are the node the route starts at and the node it ends at. The last bit, which I have set to false, is about directionality and one-ways.
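A sketch of the pgr_dijkstra call described above. The inner query string maps the noded table onto the id/source/target/cost columns pgr_dijkstra expects; the vertex IDs are hypothetical, and the table and column names are assumptions:

```sql
-- Shortest path by minutes; switch cost_minutes to cost_length
-- for shortest distance instead.
SELECT d.seq, d.path_seq, d.node, d.edge, d.cost, n.geom
    FROM pgr_dijkstra(
            'SELECT id, source, target,
                    cost_minutes AS cost
                FROM routing.roads_noded',
            10101,      -- hypothetical start vertex id
            20202,      -- hypothetical end vertex id
            false       -- undirected: ignore one-way restrictions
        ) d
    INNER JOIN routing.roads_noded n ON d.edge = n.id
    ORDER BY d.path_seq;
```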
If I were in a car, doing routing for cars, and I wanted to observe one-way signs like a law-abiding citizen, we would set this to true to take one-way streets into account. Considering this is a foot model, and most footways don't have one-ways associated with them, I'm ignoring that for this particular example. At the end here we're joining to the noded data, which gives us some metadata, and then to our routable table for some more metadata. Then I'm ordering by this path sequence column, which forces the rows into the order they would actually happen in the route. So if I click on this first geometry here and fly out the side panel so we can see it: the first geometry is indicating that we're going to walk around the entire top of the mountain before we start descending down the stairs. The point it discovered was right here at this junction. That's where it decided to start, and apparently we have to walk all the way around, and over here, to get to it. This is obviously not ideal, unless you want the panoramic view one more time; from a routing perspective, this is where you run into the limitations of a simple nearest-neighbor selection methodology. Continuing down our route, I can select the next geometry and we're walking down some big stone steps. I can keep clicking through, selecting a few at a time, and we can see the way this route is coming together. I can select the whole thing and it'll pull up the whole route for us. Flipping over to the spatial view, we can now also see our start and end points: the start point in blue up here, and the orange line is our route. It routes us over here, and we can see that it's avoiding the city streets and having us walk down sidewalks. This is a good thing for safety, I suppose, and it's going to get us where we need to go.
Now, when I'm working on a project like this, I personally don't care about the individual steps, other than making sure they're valid. What I really care about is the aggregated route: what does the overall route look like, and what's the overall cost. Those are the details I care about, because I'm coming at this from a large-scale analytic perspective rather than from the perspective of a single end user going through an app on their smartphone. So I'm not concerned about the individual steps, and I've created a geometry column to store my route and cost information. I've simplified the prior query; this is essentially the same query with a few fewer joins. I'm also taking an aggregate of the cost here, summing up the cost for each step into a total, and doing the same with the geometry, collecting it all with the ST_Collect function into one single multilinestring. The inner portion of the CTE, aliased here as a, gets used down at the bottom to do the update: we set the cost minutes column I just added to our aggregated cost, set the route column to the aggregated geometry, and then use that aggregated geometry along with the ST_Length function to recalculate the cost length. So I can go ahead and run that; it updates my single row. If I look in the my route table we've been building, we have our start point, our end point, and this single route column holding our multilinestring, so we can see all of it together as one. And we can still get back to the start and end nodes, so when I'm troubleshooting this later on and think "I don't think that's right," I can actually get in here, look at the nodes it selected, and maybe work on improving the algorithm.
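The aggregation step just described can be sketched as a CTE feeding an update. Vertex IDs and all table and column names are assumptions carried over from the earlier sketches:

```sql
-- Collapse the per-step results into one total cost and one
-- multilinestring, then persist them on the route row.
WITH a AS (
    SELECT SUM(d.cost) AS cost_minutes,
            ST_Collect(n.geom) AS route_geom
        FROM pgr_dijkstra(
                'SELECT id, source, target, cost_minutes AS cost
                    FROM routing.roads_noded',
                10101, 20202, false) d
        INNER JOIN routing.roads_noded n ON d.edge = n.id
)
UPDATE routing.my_route r
    SET cost_minutes = a.cost_minutes,
        route = a.route_geom,
        cost_length = ST_Length(a.route_geom)
    FROM a;
```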
I also have the total cost in minutes here: it says it's going to take me just over 38 minutes to get there, and the overall length is just under 2.4 kilometers, because this data is in meters. So we have all this data persisted in a table. Now I'm going to switch gears a little, back to the layer styles I promised. The layer_styles table is a method of saving QGIS styling information directly in the database. We have the table here, and it's included with the demo data. The table catalog column is the database the style is expected to be in, followed by the schema name and table name; this is all used by QGIS to figure out which styles apply to which layers. A really neat option is the use-as-default column. When this column is set to true, which DBeaver visualizes as a checkbox, QGIS will automatically grab the style information when the layer is pulled in and apply it, so you don't have to do that manually. And a real quick peek into the style QML column: this is the raw XML that QGIS uses. We can see right here that it identifies itself as QGIS, and looking through it, it's a whole bunch of gibberish related to styling. So I'm going to flip over to QGIS now. I have a blank project up, and in my browser I've already connected to this database, golden OSM demo. I'm going to refresh the catalog, because I've been dropping and recreating objects and want to make sure it knows what we've got available. Right away I'm going to go over to this public schema. I can't get my prior testing completely cleared out, but I can bring in the information from the my route table, and it'll draw it: we've got the two points and the shape we've been seeing all along, which is kind of nifty. More importantly, back to the styling information.
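To see which saved styles will auto-apply, you can query the layer_styles table directly; the column names below follow the table QGIS creates when saving a style to the database:

```sql
-- List saved QGIS styles that will apply automatically on load.
SELECT f_table_catalog, f_table_schema, f_table_name,
        stylename, useasdefault
    FROM public.layer_styles
    WHERE useasdefault
    ORDER BY f_table_schema, f_table_name;
```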
I'll expand the golden schema, where our source data is, and grab this building polygon. If I drag it in, we get, granted, a boring gray with some basic boundaries; it doesn't look too exciting to start. But if I zoom out on the map, we'll notice that it just became much more transparent: the styling changed and the buildings became less prominent. Zooming out one more time, the buildings completely disappear. So I can already tell that I have some zoom-level-based styling applied. This is super helpful for analysts working in this type of software; if they have to manually set up these style levels all the time, it's extremely time consuming. An even better example of that is the roads data. I just brought in the roads line, and the styles that were saved in the database automatically applied; I'll drag these down under our routing layers. As I zoom in or out, we'll notice that again data disappears: some of the sidewalk data and the less important roads disappear, and as I continue zooming out, more of the roads disappear until we're left with just the major roads in the area. These styles are quite complex to build. I just pulled up the properties showing the symbology for this layer, and these rule-based styles get quite complex, because you have to have classifications for each type of subclass you're styling, and each subclass has multiple layers; over here we have the min scale and max scale columns. If the analyst had to go through each of these subclasses and set up all of these zoom levels, that would be extremely time consuming, error prone, and not fun at all. Being able to save this information directly in the database with the data makes getting to this end result much easier: the styling just applies, and I can focus instead on styling the new layers so I can actually see what's going on here.
I don't like spending time styling layers that I've already styled a dozen times in the past. This way I can quickly copy and paste styles onto just the layers that are actually new, so I can make the start and end points more visible, for example. Getting deeper into styling in QGIS is far beyond the scope of what we're talking about today. And with that, that is the end of the demo, the end of the session, and the end of the series. So again, thank you everyone who has signed in and joined us live today. Have a fantastic day.