Hello. This session is about packing the data and getting it back out. Can everybody see in here? Fine. Okay. This talk is about programming techniques to save time and space. Basically, you can get disk very cheap now, but even if a terabyte disk is really cheap, that doesn't mean getting a terabyte off a single spindle is going to be anywhere near fast. Disk access takes time. Larger data sets can be handled with the same old hardware, and data can be organized for quick access for common queries. Before you try to use these kinds of techniques, it helps to know your data, know your access patterns, and think about what the common case is so you can optimize for that. But you really shouldn't be afraid to experiment: I've gone through several iterations of the program I'm using as an example and improved the format each time. The example data set I'm using is the OpenStreetMap data. It consists of a weekly planet file, which is six gigabytes bzip2-compressed and over 120 gigabytes of XML after you expand it. Despite the fact that XML is an extremely inefficient way of storing things, a conventional SQL database winds up even less efficient: you wind up using about twice as much disk space as the uncompressed XML. There were also some problems originally with applying all the minutely changes to the SQL database; they eventually got worked out, but that was true at one point in time. The OpenStreetMap data consists of change sets, which I'm going to ignore for the purposes of this talk, because they describe how the data got to its current state rather than what the current state is, and all I care about is the current state; and then nodes, ways and relations.
A node has a latitude, longitude, version and an optional set of tags; I'll explain what tags are later. Ways have a version, a list of nodes, and tags. Nodes are used either as parts of ways or relations, or else as a point of interest like a shop or a stop light or even a tree. Ways are ordered lists of nodes and are used for things like streets or lakes: a lake uses an ordered list of nodes, and because it's tagged natural=water, it's known to be an area. Relations have a version, members and tags; members can be nodes, ways or relations, and each member also has a role. They're used for things like turn restrictions; they get fairly complex, and exactly what they're used for doesn't really matter for this talk. All of those pieces of data can have tags, which are a key and a value, both of which are free-format UTF-8 strings. So a way can have highway=steps to indicate that it's a staircase. Keys are unique within a node, way or relation: you can't have two highway keys on the same way. Some strings are common, others are rare. One of the problems with using free-form strings is that you get typos in your database; there are a number of "higways" and so forth. But it allows for easy expansion, and you can create your own new data: if nobody has a tag for a street light, you can make one up. Well, actually there is a street light tag already. But if there's something you think should be on the map, you can add it to the map without having to go through a complicated approval process. Now, if you want it to be used much, you do have to get people agreeing on what it means. Roles are a simple UTF-8 string, and mostly they're actually empty strings. Okay. Here's an example node.
The node ID, 108331, is just a unique identifier. It's unique for a node, but there can be a way with the same number, and a relation with the same number. Version equals five, which means this particular node has been updated four times since it was originally created. The timestamp is the time of the last update, the uid and user are who did the last update, and the changeset is the change set this update was part of. All of those are actually being ignored in my example, because they don't really apply. It has a latitude of 51.496-something and a longitude of minus 0.13-something, so it has a location. And it has a direction. Why would a node have a direction? Oh, it's a mini-roundabout, a clockwise mini-roundabout, so it's probably in the UK, where they use just a painted circle in the middle of a street intersection to say: treat this intersection as a roundabout. Okay. Here's an example way. This one's particularly simple because it only has two nodes and one tag, but I chose simple ones because they fit on the slides better. And here's an example relation. This winds up being a simple relation too, but it was still difficult to get it to fit on the slide. It happens to be a turn restriction, indicating that you can only make a right turn at this particular intersection. Okay. Now I'm going to talk about some pseudo data types I wound up defining in order to pack this data in. A lot of the numbers I'm using can be fairly large, but numbers closer to zero are a lot more common, so I decided to use a variable-length encoding scheme: seven bits per byte for the actual number, and one bit as a continuation bit.
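The talk shows this as a Perl routine on the slides; what follows is a sketch of one encoding consistent with the description, in Python, with function names of my own choosing. It includes the refinement described next, where multi-byte encodings skip the values that shorter encodings already cover:

```python
def encode_venum(n):
    """Encode a non-negative integer using 7 data bits per byte, little-endian.

    The high bit of each byte is the continuation bit. Before emitting a
    continuation byte we subtract 128, so two-byte values start at 128
    instead of duplicating the one-byte range (the minor optimization
    the talk mentions)."""
    out = bytearray()
    while n >= 0x80:
        n -= 0x80                      # bias past the values shorter encodings cover
        out.append(0x80 | (n & 0x7F))  # low 7 bits first (little-endian), cont. bit set
        n >>= 7
    out.append(n)                      # final byte has the continuation bit clear
    return bytes(out)

def decode_venum(buf, pos=0):
    """Decode one venum from buf at pos; return (value, position after it)."""
    n, shift = 0, 0
    while True:
        b = buf[pos]
        pos += 1
        n += (b & 0x7F) << shift
        if not (b & 0x80):
            return n, pos
        n += 0x80 << shift             # undo the encoder's per-byte bias
        shift += 7
```

With this layout, one byte covers 0 through 127, two bytes cover 128 through 16,511, and three bytes cover 16,512 through 2,113,663.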
So in one byte I can store a number between zero and 127, two to the seventh minus one. In two bytes I can store 128 through 16,511: slightly more than two to the fourteenth values, because as a two-byte encoding I'm not duplicating the possibilities that can be stored in a single byte, so I get an extra 128 possibilities. It's a very minor optimization. With three bytes I can store up to slightly more than two to the twenty-first, et cetera. This wound up working quite nicely. And I store it little-endian, because it was much easier to program the access for these numbers little-endian than big-endian. I have a little Perl routine for storing one of these "venums"; if you want to look at it later, as of about an hour ago I uploaded the slides for my talk, so you should be able to access them. And there's a routine to fetch a venum, which actually reads it from a file, because I was always reading them from a file; the store I was doing to a string, which I manipulated further before writing it to the file. Another common data structure I needed was for the tags. There are a lot of common tags and then some really uncommon ones. Storing every tag as a string winds up with a lot of duplication, but storing it as an index into a table raises the question of what happens when somebody creates something new: do I have to create a new table entry, and so on? So what I wound up doing was using a single value to indicate that a literal string follows. Uncommon strings are indicated by a zero, followed by the actual string, followed by a null to terminate the string. The common strings are stored as a venum.
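Here is a sketch of that common-string scheme in Python. The 1-based indexing is my assumption, reserving venum value zero as the "literal string follows" escape, and a minimal venum encoder is inlined so the sketch stands alone:

```python
def _encode_venum(n):
    # minimal venum encoder (7 data bits per byte, little-endian, biased)
    out = bytearray()
    while n >= 0x80:
        n -= 0x80
        out.append(0x80 | (n & 0x7F))
        n >>= 7
    out.append(n)
    return bytes(out)

def _decode_venum(buf, pos=0):
    n, shift = 0, 0
    while True:
        b = buf[pos]
        pos += 1
        n += (b & 0x7F) << shift
        if not (b & 0x80):
            return n, pos
        n += 0x80 << shift
        shift += 7

def build_tables(strings_most_common_first):
    # hash for encoding, array for decoding; index 0 is reserved as the
    # "literal string follows" escape, so common strings are numbered from 1
    decode = [None] + list(strings_most_common_first)
    encode = {s: i for i, s in enumerate(decode) if i > 0}
    return encode, decode

def pack_string(s, encode):
    idx = encode.get(s)
    if idx is not None:
        return _encode_venum(idx)                 # common: one- or two-byte index
    return b'\x00' + s.encode('utf-8') + b'\x00'  # rare: escape, literal, NUL

def unpack_string(buf, pos, decode):
    idx, pos = _decode_venum(buf, pos)
    if idx != 0:
        return decode[idx], pos
    end = buf.index(b'\x00', pos)                 # NUL-terminated literal string
    return buf[pos:end].decode('utf-8'), end + 1
```

Because the table is sorted most common first, the 128 most common strings cost one byte and the next 16,384 cost two, exactly as described.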
And I sort so that the most common strings have the lower numbers: the most common 128 possibilities are stored in one byte, the next 16,000-odd take two bytes to store, and so on. This wound up being quite an efficient way of handling it, while still keeping the full flexibility: you can still use any string you want. I'm using a hash for encoding the string and an array for decoding the string; I assume everybody knows how to do those. To get the set of common strings, I run a program over the current database and create the common-string table. Once I do that, I'll use that table for a while, and then run the program again to get a new version that compresses things further, because the most common strings actually change over time. I use a single venum at the beginning of each data file to store which version of the table it uses, so I don't have to update the whole database at once. And I use separate tables of common strings for each type and key. It's getting cut off at the bottom of the slide here. Okay. I actually store a node as a size, which is a venum; then an ID, for which I'm actually using a standard 32-bit number, because there are more than two to the twenty-eighth node IDs, I believe, and I determined that on average the size would be bigger using a venum, because of the number of five-byte encodings needed for full 32-bit values, than just using four bytes. Although when the number of nodes in OpenStreetMap quadruples from what it is now, we'll have to think about going to venums, because it'll be more than a 32-bit integer can handle.
But long before that, I'll have some problems with Perl, which doesn't seem to have an option for handling seek on a very large file. The latitude and longitude I'm storing as 32-bit integers: what I'm actually storing is the latitude and the longitude times a scale factor. Not 10,000; I think it's 100,000 or something like that. So I'm using almost all of the 32 bits, but storing it as an integer. If I used a floating-point number, I wouldn't have enough accuracy for latitudes and longitudes further from the equator, and the way the data works, we don't need more accuracy close to zero; zero is pretty arbitrary in latitude and longitude. Then the version, which is just the version number from the data, and a set of tags, each a key followed by a value, both of which are common strings. I indicate the end of the tags by using the size at the beginning, so I don't have to store a separate count of tags or anything. It winds up being a little complicated: I can't just index into the tags, but I generally don't need to do that. Ways are stored fairly similarly. Because there are not nearly as many ways as there are nodes, I'm using a venum to store the IDs of ways. The node list is stored as a number of nodes followed by the list of nodes. Actually, I should have used unsigned on the node ID here to match what I did above, but it's basically still a 32-bit integer. And relations are a little more complicated. There's the size, ID and version, and a list of members, where the member type is zero to indicate that this is the last member, one for a node, two for a way, and three for a relation.
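Putting the pieces together, the node layout just described might pack like this. This is a sketch under stated assumptions rather than the talk's actual Perl code: the field order, the venum-encoded version, and the choice that the size counts only the bytes after the ID are my guesses from the description, and the 100,000 scale factor is the talk's own uncertain recollection.

```python
import struct

SCALE = 100_000  # coordinate multiplier; the talk was unsure of the exact value

def _encode_venum(n):
    # minimal venum encoder (7 data bits per byte, little-endian, biased)
    out = bytearray()
    while n >= 0x80:
        n -= 0x80
        out.append(0x80 | (n & 0x7F))
        n >>= 7
    out.append(n)
    return bytes(out)

def _decode_venum(buf, pos=0):
    n, shift = 0, 0
    while True:
        b = buf[pos]
        pos += 1
        n += (b & 0x7F) << shift
        if not (b & 0x80):
            return n, pos
        n += 0x80 << shift
        shift += 7

def pack_node(node_id, lat, lon, version, packed_tags=b''):
    # assumed layout: [size venum][id u32][lat i32][lon i32][version venum][tags],
    # with size counting only the bytes after the ID so a scanner that finds
    # a zeroed ID can skip the record without parsing it
    payload = struct.pack('<ii', round(lat * SCALE), round(lon * SCALE))
    payload += _encode_venum(version) + packed_tags
    return _encode_venum(len(payload)) + struct.pack('<I', node_id) + payload

def unpack_node(buf, pos=0):
    size, pos = _decode_venum(buf, pos)
    node_id, = struct.unpack_from('<I', buf, pos)
    pos += 4
    end = pos + size
    lat_i, lon_i = struct.unpack_from('<ii', buf, pos)
    version, tag_pos = _decode_venum(buf, pos + 8)
    return (node_id, lat_i / SCALE, lon_i / SCALE, version, buf[tag_pos:end]), end
```

The trailing tag bytes run up to the end implied by the leading size, which is how the format avoids storing a separate tag count.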
For the member type I'm using a full byte to store two bits of information, but I'm not really trying to squeeze out every last bit. The member ID is a venum for ways and relations, or a 32-bit integer for nodes. The role is a common string, and tags are handled like the others. Okay. On the disk, I'm organizing the data by tile, where a tile is a group of data in an area of latitude and longitude. On OpenStreetMap we use the Mercator projection for doing the maps, so I'm using the projection subroutines we already had available, or slightly modified versions thereof. Areas with more nodes are stored in more tiles; I'm not just using one size of tile for all of the world. In areas that are almost all sea there's very little data of interest, while in the center of London there's a lot of data, so to equalize somewhat the size of the groups of data, I change the tile size. To delete an object, I just change its ID to zero. So when I scan through a file of nodes, I first read the size, then read the ID; if the ID is zero, I know to just skip over that many bytes of information to get to the next one. I'm using what's called zoom 11, which splits the world into two to the eleventh, 2,048, groups of latitude and 2,048 groups of longitude, so you need four million tiles to store the entire world at zoom 11. And then I go up to zoom 16 for high-density areas. So we're talking about millions of tiny little files here, averaging less than two kilobytes each, and I wound up needing to tune the file system, as you're probably not surprised to hear. And ext is what I wound up with, from the experimentation that was done.
ext3 actually wound up being one of the better file systems to use; ext4 had some advantages, but it wasn't stable enough at the time the testing was done. And since I don't want 32 million files or whatever in a single directory, I'm using the least significant bits of both the tile X and the tile Y to decide which directory to store the file in. From a latitude, longitude and zoom you can get a tile number, but if you just have the latitude and longitude, you have to figure out what zoom that piece of data will be at. So I have an index of which tile is at which zoom, because when a zoom-11 tile gets more than so many nodes in it, it's split into four zoom-12 tiles; I have to have an index to indicate what zoom each particular tile is at. So I compute which tile a location would be in at zoom 15, and keep an index entry per zoom-15 tile. Because zoom 16 is the maximum I'm using, I don't have to store the index at the full zoom: if a zoom-15 tile is marked as stored at zoom 16, then all four zoom-16 tiles that are part of that zoom-15 tile are stored at zoom 16. I'm storing the zoom level as a single byte, despite the fact that that's inefficient again; it's really only about three bits' worth of data. So the zoom index winds up taking a gigabyte. It's a slightly sparse file, but not very: a little space is saved by the fact that I use zero to store that a tile is at zoom 11, and if there are enough consecutive zeros in a file and you don't actually write them, Linux will not use disk space to store that run of zeros. Okay. For all the other indexes, I just store the zoom-16 tile number, and I can use the zoom-16 tile number to figure out what the zoom is for that particular tile.
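The tile addressing can be sketched as follows. The latitude/longitude-to-tile formula is the standard OSM slippy-map (Mercator) one; the directory sharding and the zoom-index layout (one byte per zoom-15 tile, zero meaning zoom 11) follow the talk's description, but the bit counts and path format here are my assumptions:

```python
import math

def latlon_to_tile(lat, lon, zoom):
    # standard OSM slippy-map (Mercator) tile numbering
    n = 1 << zoom
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    return x, y

def tile_dir(x, y, bits=4):
    # spread millions of tiny files across directories using the
    # least significant bits of tile X and tile Y (bit count assumed)
    mask = (1 << bits) - 1
    return f"{x & mask}/{y & mask}"

def tile_zoom(lat, lon, zoom_index):
    # zoom_index: one byte per zoom-15 tile, storing zoom minus 11, so zero
    # (and hence a hole in the sparse index file) means "stored at zoom 11";
    # a value of 5 means all four zoom-16 children are stored at zoom 16
    x15, y15 = latlon_to_tile(lat, lon, 15)
    return 11 + zoom_index[y15 * (1 << 15) + x15]
```

At one byte per zoom-15 tile, the index is 2^15 by 2^15 entries, which is the gigabyte mentioned above.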
And I use the ID as the index into that file, multiplied by four, because it takes four bytes to store the 16-bit X and 16-bit Y of each particular tile. Okay. Updating. Deletes are handled by zeroing the ID, as I mentioned before. Updates are handled by zeroing the ID and adding the new entry to the end of the file. That means garbage gets created, so I keep a live list of the tiles that need to be garbage-collected, and I garbage-collect up to a certain number of tiles per minute. When the updating is very busy, a backlog of stuff to garbage-collect builds up, and then during the slow times the garbage actually gets collected. And fairly frequently, once one update is done to a tile, another update will be done to the same tile the next minute, so if I kept garbage-collecting the same tile over and over every minute, it would be kind of inefficient. When I create a new node, the tile is split into the four tiles of the next zoom level if there's more than four kilobytes of data in that node file, and of course if it's not already at zoom 16. Okay. On updating, I process all the node creates, then the way creates, then the relation creates, then the relation deletes, way deletes, and then node deletes. That way I don't have inconsistent data, because if you deleted a node before the way got deleted, that would create inconsistent data. That's all I have prepared. Are there any questions?

Is there an audience mic available? Well, it's being recorded, so... all right. One, test. Okay.

You talked about packing and unpacking. Can you restore the actual planet database from that format, or is it lossy, because you convert these floating-point coordinates to integers?
I don't have full accuracy on the latitude and longitude, but it turns out that the way they're storing it in SQL is basically the same technique, so I'm not losing any more data than has already been lost. I am losing the change set information and the user information.

Okay. So I cannot download data to the editor from that server and then later upload it to the main server?

One thing that I should have been clear about: this is the version of TRAPI that I'm working on, rather than the version that's currently live and available. The version that's live and available is not storing the version number, and it's also filtering the tags, because it's primarily used by the Tiles@home renderer and a lot of common tags are just not of interest. Before I figured out how to store the tags as completely as I do now, it was taking a lot of disk space just to store created_by tags and so on.

Okay. But it would be really interesting to have a real mirror of the real database, because at the moment, if the real database is down, you pretty much cannot do anything.

Yes. This should in theory be able to handle storing enough data to do the edits. However, there's a problem at the moment: the hourly, or rather the minutely, change files are not reliably capturing all of the changes, and unfortunately I can't do it if the data isn't being passed to me.

Okay. I didn't know about that.

Yeah. It's a problem of when a change set spans too long a time, the way the minutely changes are generated: you can have a way that is missing some of the node creates that are part of that way. It's a known problem and it is being worked on, but I'm not sure when that's going to be fixed.
Actually I'm winding up using the minutely change files from a half hour behind, which have fewer of those problems, but it still winds up that there's some missing data. So we can use it for rendering and other things that don't have to have 100% of the data, but unfortunately it is still not fully reliable.

Okay. Thanks.

Okay. Any other questions? Oh, I suppose I should have mentioned some numbers. The original version of TRAPI, the one that did the tag filtering, wound up storing the entire database in about 15 gigabytes of disk space. The current version, which I'm running at home but which is not running live anywhere else yet, stores it in something like 22 gigabytes. So it's certainly not as efficient a format as the bzip2-compressed file, but it's a lot more randomly accessible, and it's indexed by tile, and most of the data queries we're getting are by location: give me all the data around this area. I'm actually returning too much data, because I round the request up to whole tiles, which is one of the reasons I keep the tile files so small: so I don't return way too much data, because of the network transmission costs. There was a small period of time when one of the TRAPI servers was paying per traffic volume, and it wound up being way too expensive, even the way I was doing it. Oh, and a lot of these types of programming techniques are really pretty old, but most of the current crop of programmers don't seem to be taught them, because disk is cheap and memory is cheap, so why would you need them?
Well, in a lot of cases you do need them, and embedded programmers, obviously, are quite familiar with packing data into small spaces. Okay. Well, if there are no more questions, thank you.