 All right, good morning everybody. I'm Leo Su, Regina O'Bay. So today we're going to give you what we think are the 10 most relevant problems that you can solve by post-GIS. Post-GIS is the spatial extension to post-grace. And by that, you add spatial data types such as polygons, lines, points, et cetera. But post-GIS actually goes much further than that. And we hope through these 10 problems or that we're going to solve that you can actually get a breadth of what post-GIS has to offer. And you'll find that it's a lot more than just simple geometries that you're working with. I'm going to turn everything over to Regina. And she's going to present the first eight or nine problems. And if there's time left for me, I'll do the last two or three. OK. So the very first problem that post-GIS is used to solve are what I'll call proximity problems. And these are questions like, what's the distance between this and that? Are these set of points within a certain distance of this other set of points? And also aggregation based on containment. So how many things are within this region? So if you have, for example, a sales region and you need to catalog how many people are in each sales region, you would use these functions. So the first set is finding the closest things to a point. And here for this example, I try to find the first, the five closest restaurants to a particular location. And this is using the geography data type, which uses longitude latitude but returns distance in meters. The next common thing is how many things are within a particular area. And the area could be either a point, so within a distance of a point, or it could be a polygon, it could be a road. And so this would, you'd use a function called STD within, which takes two geographies or two geometries and a distance measure, which is how close you want to compare things to be. The third is containment. So if you want to count things, you can use standard SQL aggregation counting. And you can use functions like ST covers, ST intersects. And so for this example, if you want to know, for example, how many subway stations are in each borough of New York, then you would use the ST covers function to say, for each borough, how many station points does it contain? And you would just get a list by borough name and number of stations. And fairly recently, in Postious 2, Postious added a whole bunch of functions for dealing with 3D data. So this is interesting if, for example, you're doing wire cabling. So in those cases, the Z index becomes really important. You can't just treat line strings as 2D. So we have functions that will consider the Z index, the elevation. So these functions, instead of ST distance, you would use ST 3D distance, which considers not just x and y, but x, y, and z. And similarly, you'd have ST 3D d within, ST 3D intersects. And if you have the extended 3D, which is the Postious extension, Postious SF seagull, you get additional functions like ST 3D intersection, which will return a new geometry, a new 3D geometry, which is the intersection of, for example, two 3D polygons in the area that's shared between those two. And one thing that a lot of people don't know is that Postious supports raster, which a raster is pretty much a picture. And in a mathematical sense, you can think of it as an n-dimensional array matrix, an n-dimensional matrix. So you can have, for example, each band is a separate matrix. And you can align that matrix with a geometric location on Earth and then do things like intersections with geometry. So for example, if you got elevation data, which is usually expressed as raster, you can ask for a set of points, what is the elevation at each of those. And you would use, again, the ST intersects. But ST intersects has an overloaded function that takes a geometry and a raster. So you can actually figure out if geometry intersects with an elevation area. And then the ST value function is a raster function that, for a given geometric point, it will return the pixel value, which is the value on the specific band you want, in this case, band 1. And for elevation data, there's usually only one band. So the value would be the value in the band, which is usually in meters. So these ones are. So this would be elevation in meters for each of these locations. The other thing that's special about postures, which you'll rarely find in other spatial databases, is the ability to re-project in the database. And what re-projection means is that for 2D geometries and rasters, it's really just a grid. And in order to be able to say that this grid is the same as this grid, they have to share what's known as a spheroid, a datum, and the way you kind of squash the earth into a flat surface. And if they don't, you can't overlay them. So transform. So in order to overlay two disparate grids, you have to transform one to the other. And so there are lots of mathematical functions that allow you to do that. And the table in postures called spatial refsis basically tells the system this number is a unique identifier in the spatial refsis table. And it tells the system how to project from one to another to another projection. So if you had a table of geometries, for example, this one is in New York State Plain Feet. You can, and you had the geometries in longitude latitude, which is a common projection you get. You can project it to State Plain Feet if all your other data is in State Plain Feet. And in order to intersect two things, you need them to be all in the same projection. So both geometry, so geometry has a projection. Geography is, since it's feroidal, it doesn't need to be projected on a flat surface. So it generally doesn't. But you can convert geography to a geometry and geography to geometry. And you can also transform rasters because you can, for example, if you're doing a flyover and you're doing it in State Plain Feet to feed it so that it's standard across the US, you might want to convert it to longitude latitude projection. And so the functions in postious ST transform also work for raster. And it's a different set of functions. And raster is a little bit more complicated with the transformation because really the idea with rasters you're warping the pixels. So you're kind of changing the shape of them. So you have extra arguments that tell you what algorithm to use to warp the raster. And another common use case for postious is map tile generation. And you'll see that a lot in open street map data. And common tools that are used are tile mill, which is a desktop tool. And map nick, which has a set of APIs that can be plugged into Python and PHP and other things for doing tile generation. And postious has numerous output formats. So the most common one used is the GeoJSON format, which goes nicely with the PostGrace JSON format to create, for example, feature collections, which can easily be laid on mapping tools such as OpenLayers and Leflip, which are the most common ones. And then there's a new one in 2.5, which outputs data as a map box vector tile, which is a new output format that it's not an image. It's basically the vector data and all the features. And it's in a binary format, so it's very small. And it can be styled on the client side, however you want. And there's also a X3D function in postious, which is useful for 3D visualization. And there's also the 3D bounding box operator to intersect 3D geometries or see if they intersect in space. There's another extension called postious tiger geocoder that comes packaged with postious, which uses the tiger census data for geocoding all of the US. And it uses street center lines. And the first piece of it is the address standardization, which will take an address and split it into its component parts. So for example, if you have an address like 29 Fort Green Place, it'll break it out into city state zip and the house number, which is what you'd need to be able to align it with street center line data to determine where it is along the street. And once you have your address standardized, then you can geocode it. So the geocode function internally standardizes it first and then geocodes it. So here's an example where we just take an address and we return the best answer. In some cases, you could end up with 10 or 100 if for example, you gave the wrong zip code or you spelled the street name wrong. And there's also a reverse geocode function or reverse geocoding is basically saying if you have a longitude latitude, give an expression of that in English terms or Spanish terms or any terms that you want. So this will give in a point, it will try to give English description of what the address is. And for raster, not only can you use it for geospatial, you can also use it just to change pictures in your database. So for example, if you wanted to, if you had a folder of pictures that you wanted to load them in, you can load them in using this command line tool called raster2pgsql or you can use the feature in Postgrease that allows you to run an operating system command like LS to pull in the listing of the pictures and then you can convert the path to a reference to the raster. So this one is really just what's called an out-of-db raster in the sense that it's not gonna pull the raster in the database but it references the raster that's outside of the database. And even if the raster is outside of the database, you can still use the raster functions like for example you can get the width in pixels, the height in pixels, the number of bands. So this example tells us that there are three bands for each of the pictures which correspond to the RGB channels of the image. And even though it's still outside of the database, you can create resized versions of it. So in this example, we take all the pictures and we convert them to 25% of the original size and we output it back using the Postgrease large object support. So this basically, this function is a post just function that converts it to a PNG format and then we can export it out using the large object byte A and then so it's on the server in a folder in the TMP folder and then you can just pick it up. And so what you get is here's the original picture and here's the resized picture. You can also do things like change the pixel values of a raster so if you wanna change the coloring of a picture you can use the reclass function. So for example, this is the picture we had before and this is the picture after we change all the blacks to whites and the whites to blacks. And then the most commonly used raster function is the ST clip function and that one is used to take a portion of a raster and just take that area that's within a geographic region. So for example, if you have a picture, you can take the centroid of the bounding box of the raster and then you can buffer it so that you end up with a crop picture. So for example, this is the original picture and then this is after you crop it by the center and then 120 pixel. And then the more common use for raster is for analyzing environmental data. So for example, if you had a, elevation data and you have a set of roads and they're in the same projection, you can use several functions and posts just to get stats. One of the common ones is the summary stats function which will return min, max, count and mean. And you normally use this with the ST count function because if you have a road, it could overlap several tiles. And so you need to summarize those. So the way you can summarize it is, well, getting the min is easy, getting the max is easy, but to summarize the mean, you'd have to multiply it by the count of the pixels to get an accurate mean value which is over. Okay, so here's a rather non-traditional use of post just. So if you think of time ranges, the period, the range type was introduced in post grace 9.3 or thereabouts. And so that allows you to specify two points in time, one is the beginning, one is the end and that's the date range. So if you think about it, a line string is really a time continuum. So you can take advantage of post just to do a lot of the date time range operations that are not packaged with the period, with the time range data type itself. So let me just show you an example of that. The key thing is you have to map a point in time. In this case, we've chosen some point in time as a negative infinity. And then you have to map negative infinity in time to some point on a number line and then you have to pick infinity as some number on the number line. And then so this particular function converts your line string back to a date and then there is the other function which converts it to a date range from a geometry. So let me go back. Okay, so this one, this one, you start with the date range, it will convert it to a line string and then the one before will convert it back. So once you have these two functions, you can flip between the time dimension and the line dimension or the linear dimension. And then you can use all the functions and post just for your time computations. So here's an example of three date ranges. And I want to know, well, what do they collapse down to? So in this case, you have a date range that goes from 1970 to 80s and then you got something that's from 75 to 95. And then something that cuts in from 90 all the way to infinity. And I want to know, well, what is the total coverage of these three distinct date ranges I can just use the line merge function after I union them and dump them out and use my conversion back to the date range and I get my answer. This is if the date ranges are not adjacent, you could end up with multiple answers. This is not something that you can do in with the data type range data type right now because there's no function that will return more than one row right now. So, but you can take advantage of post just to take your date range and collapse it down to however many that we need to end up with. In this case, you have two. Am I unnesting an array in this function? I in the two date range, no, I would not say not strictly in the two date range. Let's go back to date range. No, I'm not, I am not, I avoid the unnesting of the array but I do do it later on where there's another function when I build a aggregate function, there is some unnesting involved. But this is simply taking advantage of the, it's the stdump function that will do the unnesting and that's the post just function that does the unnesting. Any more performance? Not necessarily more performance, but it works. And then this is, this one's all continued to give us and so everything collapses down to just one date range. And so once we have enough of these uses, we created an aggregate function out of it. So we can pass in a set of date ranges and then just, let me show you how to call it, we can pass in this set of date ranges and then use our collapse period function to collapse them instead of going through all the separate functions and postures. And then because we only wrote one that will return as an array, so we have to do a final unnest. No, because you're starting with a date range which is unprojected. So it, so there's not much relevance to projecting it. We're talking completely about a number line. Yeah, because if you have postures, you don't have to worry about, you can define your own coordinate system and it's just arbitrary. It could be start from anywhere to anywhere. And it's only if you need to align it with some place in space that you need to worry about projection. Okay, so we took this one step further. All right, this is a common case where we just have a lot of date ranges and we have in this case, test scores, but you could have stock prices, you could have a lot of anything with a date beginning and a date end. And it's just thrown at you and you want to say not only collapse the range down to what is minimally feasible, but also to find in this case, the highest grade that was given for these time series for each of the person taking the test. So this first step here, so these are the input ranges, you got this guy Alex and this lady Beth and these are the dates and these are the scores that they have on these dates. And so the first part of it will take all the distinct cuts that could result from just overlaying these low segments on top of each other. So let me see, I think the Beth example is probably a little bit easier to show. So from January one to January three, she has an A from five to nine, she has a B and then there's an overriding X from one to nine. That's if you weren't evaluated, then we should see an X during that time period. And then here is the results for Beth. So on the date that she's not evaluated, which is the third to the fifth, which we don't have any other score for, she falls down to an X. And then there's just one more step if you want to combine two contiguous scores like right here, because this could be collapsed down to one score that goes from the third all the way to the sixth. And that's just another step here where we pass it through what I've shown you before to collapse the dates. So now we're down to what is the minimal number of date ranges needed to specify the highest grade that each person received. And again, you can throw as many date ranges and scores at this, and it will just take care of it by running through the function. I've only given a small sample. So that's post just, and then we wrote a book on PG routing. PG routing is another add-on that works with post just which allows you to do routing. And let me just start you off with some examples. So traveling salesman problem, this is a classic resource optimization problem. You've got this salesperson that has to visit various towns to sell vacuum. And you want to find what is the shortest distance that the salesperson should take because you don't, because say he or she is driving the company car and you don't want that many miles on it, okay? So that's a traveling salesman problem. And after you install PG routing, the PG routing comes with algorithms to solve this automatically. Pretty much it's just one function. Once you set up your network, you just run it through the function, you get the result. And in this example, I pulled all the longitude latitude of nuclear power plants around the world, at least as of a few years ago. And I pretended that the salesperson is the IEA inspector who needs to go around to all these plants and check them out. So I have their coordinates and then I assign each power plant a unique ID. And then that's pretty much all I need to do for the input. And then I just run this TSP function. And out comes the results. So, can't load this up too much, but so the inspector starts somewhere off here in the Czech Republic, which is closest one to Geneva, which is where I think IAEA is based. And then he or she will go through Eastern Europe, go through Russia, go down to that broken one in Japan, and then make his or her way down Asia, fly over to South Africa, South America, and then back to the United States and then get back to Europe. So all this can be easily done. You know, now I don't know what magic goes into the TSP algorithm. I'm sure there's extensive literature on that, but to me, I got a problem. I write a function call and I got a solution. It seems plausible. So that's one use of PG routing. Another one is what we were going to call catchment areas. We made up that term when you write books, you can sort of make up your own. Well, it's not a made up term, but the way we're using it is kind of made up. Okay, yes, it's standard. Okay, so the idea is we took the OSM data from Washington, D.C. So that's got all the streets and where all the points of interest are restaurants or points at least people wanted to identify on OSM. It's got road networks too. And all the road networks. And then we have a, we have one fire station. We set up a fire station somewhere in the city. It's a real fire station. It's a real fire station. It's a real location. Real location, right. And then we want to say, well, based on five minutes, what are the areas that can be serviced by this particular fire station? And this is really the code to it. It's a, the key function here is PGR driving distance. And it's kind of a misnomer because your network doesn't have to involve driving. You could be flying, you could be biking. It doesn't really matter. It's really the network and the cost that you give it in the past. Right, so I mean, the cost. It might not even be time. You might be measuring like the cost of tolls or how hard it is to drive across the fuel that you're using. Right, fuel consumption. You can put in, you can have routes that are illegal, like illegal, going down a one-way street in the wrong direction where you can assign a big cost to it. Truck routes, things like that. So this is our resultant. This is our fire station here at the Little Star. And I think we picked up a pretty big station that has all sorts of trucks. So within five minutes, and this is Washington DC. And this fire station looks to be in the southwest, southeast quadrant of the city. And so it can reach, Washington DC is not that big. So it can reach this area and beyond if it needs to go into Virginia and so forth within five minute time. And then. But we ignored one-way, so it probably. Yeah, we ignored one-ways because fire trucks can go down one-way streets. Okay, and then another, a very common use is to use the Dijkstra algorithm, which is to find the best route between two different places. That's a little bit different from a traveling salesman problem, which is trying to minimize the total length of travel. So for this, we took the London subway system. Let me see if I can get to the site. We may not have internet, but so the London subway system is, as you can imagine, a complicated web of subway lines and stations. So we download that data and it comes natively from the government in longitude latitude. And once we have all the stations and the lines, it, we just did some cleanup because they have some stations that seems to be floating above ground but not much cleanup needed. And then we can start routing with the data that we prepared. And in this particular problem, I am trying to see what is the best way to get from Cheshire station to West Croydon. And this is the optimal path. This is the entire network. As you can see, it gets pretty hairy in inside center of London right here. So there's actually, you start here, you end up here. And then within it, you have to take multiple, you have to make multiple transfers to get to your destination. But again, it's as simple as this function. And most of the time it's spent downloading and preparing the data. And then it's just calling the function. And there are multiple variants of Dijkstra. Like for, there's one where you can specify all the points, the order you want to go and it will give you the itinerary. And that's usually used with TRSP because you first use TRSP to figure out the order you want to travel. And then you use Dijkstra to figure out the actual path to go. Okay. And so, that's it for the presentation part of it. Here are some links. And oh, this is where you can go to buy our book. That we have one on, we wrote one on PostGIS in its second edition. And then we have one on PostGRACE. And we're working on the third edition. Which should be out in another month or two. You can actually buy it now. It's just, we have eight of the 10 chapters. We're still working on the two chapters but they'll give you the electronic. And then the final version when it comes out. And then the PG routing, that's- That you can also buy too. That you can also buy- It's just not in hard copy. Not in hard copy. We just turn it in. But many people have already bought it. Even without having, we like how modern day publishing works. You can sort of give people piecemeal chapters and somehow they'll still buy. Okay, so that's it for the presentation. And we'll take questions if you have any. Yeah, there's a PG routing layer which supports, I didn't have a picture of it. Which supports all the routing algorithms that PG routing has. So as long as you have PG routing installed on your database, you can pick from the dropdown all the different algorithms and then use QGIS to pinpoint the points. You can always get data. Like one place that I like getting OSM data is, I remember, Geofabric. Actually it's probably easier to go to mine. We also have a GIS site which I've kind of embarrassed I haven't kept. Oh wait, we don't have internet. Oh I know. Yeah, so Geofabric I've found is the best. And they give you, they give you prepared downloads. So for example, if you want all of Africa or of North America, you can download a single file and load that into your postures database. And there are tools like Impossum and OSM to PG SQL. And then there's also OSM to PG routing which will take the data and prepare it for routing for you. So you don't even have to think about it. I think there's a whole bunch that are tied against OpenStreetMap. And ideally we'd want to create a geocoder against OpenStreetMap. We've just gotten around to it. Yeah. But that would be, but yeah there isn't any reason why you can't create a geocoder against OpenStreetMap data. Because it has all the street center lines. And a lot of the governments, they feed into OpenStreetMap their data. And a lot of it is just pulled in also, as in addition to being crowdsourced. Yeah, it's an approximate, yeah. So it uses, yeah, it uses heuristics. Heuristics, yes. And there's another one which is still kind of in beta, the VRP, the vehicle routing problem, which is kind of, it's more of an advanced problem to TSP. So if you have a fleet of vehicles and you have a schedule of customer that you need to visit, you need to visit customer A between eight and nine and customer B between 10 and 11, it will come up with, okay, you gotta send this vehicle here. This vehicle will visit these customers and this vehicle will visit this set of customers. And it also considers weight constraints. So if your truck can only hold 500 pounds, you can only load so much. That's, and so basic Uber ride sharing problem, you got a car and it can hold five people and here are the people who are hailing for you. So it will, you can determine, well, who do I pick up first and then where did I drop them off? And then it will let you see up to the number of people that you have. And you can pick up people from where you left the other one off. Yeah, so that's... There's a whole, it's gonna be more beefed up than in 2.6 and 2.5, it's already got some of it. 2.3 S. 2.3, yeah, it's gonna be beefed up and the whole family of functions is called... Vehicle routing. Vehicle routing problem, VRP. There are, there's some, you have to look at the different variants that's offered by Dijkstra, but I would not say that you can do much with the underlying algorithm, but it accept different scenarios, different type of inputs. Yeah, so the way PG routing works, they have a set of functions that they call proposed. They're really inside the version, but they haven't been super tested, so they haven't been officially sanctioned, but they're always in the extension, so you can try them out. So the VRP are considered proposed, but they'll be more official in 2.6 and there's gonna be multiple variants which allow you to say, I want this to finish in X amount of time. Don't take more than this amount of time to figure out which ones to do. I prefer having fewer vehicles versus more vehicles. Longer time. You can optimize over different metrics. So you can set those metrics. In the post just book, there is well, linear referencing is you sort of, we would use it whenever we end up with data that we need to sort of pin down to a... It's only in the post just book that we have that copy. It's not like a full chapter. It's not like a full chapter. It's like three examples. But it's a big book. Yeah, it's a separate page book. Yeah, so there's a set of functions called turn restriction functions. They're not, it's kind of a misnomer because it's used for other than turn restriction. So you can specify a sideline table and in that sideline, you could have traffic flow and apply additional costs. So you have your main cost table and then you can have these adjacent tables that change constantly and it would impute additional costs to certain routes and it's also, you can also impute costs in such a way that you say, you can't go from route A to route B because that's an illegal left turn and you can pose a super high cost on that. Or there's like an accident there, you can't go there. Right, you can't do U turns if you're on this route. You can't end up on this route and you're next. So the specification for cost is a little bit more sophisticated with the turn restriction. You would swap out, you would have to swap out the information in the side table but it is a side table so that makes it easy. It's not part of your core cost table. It doesn't affect your network. It's just something you add on. Well, because it's, well, we see it this way. If you have a million of these things, it's probably easier to just run it all on the database because the database, the SQL allows for bulk operation and so we're talking one function here rather than each piece. Right, or breaking it into separate rows and I think it would be, sometimes it may be easier to write a function if you consider individual but then once you get into the millions of rows, it gets slow. And especially since databases have, they changed the plan that they use based on how much data you have and how many people are using it and getting that data. So it's more efficient if you're gonna have lots of people hitting it, you're taking advantage of what people are using. Anybody else? Okay, so, our book. All right, thank you.