Improve MBTiles documentation (#961)
Fixes #918 --------- Co-authored-by: Yuri Astrakhan <[email protected]>
1 parent 3a1f7ac, commit c6170c5
Showing 7 changed files with 213 additions and 135 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,139 +1,12 @@ | ||
# Tools | ||
# CLI Tools | ||
|
||
Martin has a few additional tools that can be used to interact with the data. | ||
The Martin project contains additional tooling to help manage the data served by the Martin tile server. | |
|
||
## MBTiles tool | ||
A small utility that allows users to interact with the `*.mbtiles` files from the command line. Use `mbtiles --help` to see a list of available commands, and `mbtiles <command> --help` to see help for a specific command. | ||
## `mbtiles` | ||
`mbtiles` is a small utility for working with `*.mbtiles` files from the command line. It allows users to examine, copy, validate, and compare these files, and to apply diffs between them. | |
|
||
This tool can be installed by compiling the latest released version with `cargo install mbtiles`, or by downloading a pre-built binary from the [releases page](https://github.com/maplibre/martin/releases/latest). | ||
|
||
### meta-all | ||
Print all metadata values to stdout, as well as the results of tile detection. The format of the values printed is not stable, and should only be used for visual inspection. | ||
|
||
```shell | ||
mbtiles meta-all my_file.mbtiles | ||
``` | ||
|
||
### meta-get | ||
Retrieve raw metadata value by its name. The value is printed to stdout without any modifications. For example, to get the `description` value from an mbtiles file: | ||
|
||
```shell | ||
mbtiles meta-get my_file.mbtiles description | ||
``` | ||
|
||
### meta-set | ||
Set metadata value by its name, or delete the key if no value is supplied. For example, to set the `description` value to `A vector tile dataset`: | ||
|
||
```shell | ||
mbtiles meta-set my_file.mbtiles description "A vector tile dataset" | ||
``` | ||
|
||
### copy | ||
Copy an mbtiles file, optionally filtering its content by zoom levels. | ||
|
||
```shell | ||
mbtiles copy src_file.mbtiles dst_file.mbtiles \ | ||
--min-zoom 0 --max-zoom 10 | ||
``` | ||
|
||
Copy command can also be used to compare two mbtiles files and generate a delta (diff) file. The diff file can be applied to the `src_file.mbtiles` elsewhere, to avoid copying/transmitting the entire modified dataset. The delta file will contain all tiles that are different between the two files (modifications, insertions, and deletions as `NULL` values), for both the tile and metadata tables. | ||
|
||
There is one exception: `agg_tiles_hash` metadata value will be renamed to `agg_tiles_hash_in_diff`, and a new `agg_tiles_hash` will be generated for the diff file itself. This is done to avoid confusion when applying the diff file to the original file, as the `agg_tiles_hash` value will be different after the diff is applied. The `apply-diff` command will automatically rename the `agg_tiles_hash_in_diff` value back to `agg_tiles_hash` when applying the diff. | ||
|
||
```shell | ||
mbtiles copy src_file.mbtiles diff_file.mbtiles \ | ||
--diff-with-file modified_file.mbtiles | ||
``` | ||
|
||
This command can also be used to generate files that use a different [supported schema](#supported-schema). | |
```shell | ||
mbtiles copy normalized.mbtiles dst.mbtiles \ | ||
--dst-mbttype flat-with-hash | ||
``` | ||
### apply-diff | ||
Apply the diff file generated from `copy` command above to an mbtiles file. The diff file can be applied to the `src_file.mbtiles` elsewhere, to avoid copying/transmitting the entire modified dataset. | ||
|
||
Note that the `agg_tiles_hash_in_diff` metadata value will be renamed to `agg_tiles_hash` when applying the diff. This is done to avoid confusion when applying the diff file to the original file, as the `agg_tiles_hash` value will be different after the diff is applied. | ||
Use `mbtiles --help` to see a list of available commands, and `mbtiles <command> --help` to see help for a specific command. | ||
|
||
```shell | ||
mbtiles apply-diff src_file.mbtiles diff_file.mbtiles | |
``` | ||
|
||
Another way to apply the diff is to use the `sqlite3` command line tool directly. This SQL will delete all tiles from `src_file.mbtiles` that are set to `NULL` in `diff_file.mbtiles`, and then insert or update all new tiles from `diff_file.mbtiles` into `src_file.mbtiles`, where both files are of `flat` type. The name of the diff file is passed as a query parameter to the sqlite3 command line tool, and then used in the SQL statements. | ||
```shell | ||
sqlite3 src_file.mbtiles \ | ||
-bail \ | ||
-cmd ".parameter set @diffDbFilename diff_file.mbtiles" \ | ||
"ATTACH DATABASE @diffDbFilename AS diffDb;" \ | ||
"DELETE FROM tiles WHERE (zoom_level, tile_column, tile_row) IN (SELECT zoom_level, tile_column, tile_row FROM diffDb.tiles WHERE tile_data ISNULL);" \ | ||
"INSERT OR REPLACE INTO tiles (zoom_level, tile_column, tile_row, tile_data) SELECT * FROM diffDb.tiles WHERE tile_data NOTNULL;" | ||
``` | ||
|
||
### validate | ||
If the `.mbtiles` file is of `flat_with_hash` or `normalized` type, then verify that the data stored in columns `tile_hash` and `tile_id` respectively are MD5 hashes of the `tile_data` column. | ||
```shell | ||
mbtiles validate src_file.mbtiles | ||
``` | ||
|
||
## Content Validation | ||
The original [MBTiles specification](https://github.com/mapbox/mbtiles-spec#readme) does not provide any guarantees for the content of the tile data in MBTiles. This tool adds a few additional conventions to ensure that the content of the tile data is valid. | ||
|
||
A typical normalized schema generated by tools like [tilelive-copy](https://github.com/mapbox/TileLive#bintilelive-copy) uses an MD5 hash in the `tile_id` column. Martin's `mbtiles` tool can use this hash to verify the content of each tile. We also define a new `flat-with-hash` schema that stores the hash and tile data in the same table. This schema is more efficient than the `normalized` schema when data has no duplicate tiles (see below). Per-tile validation is not available for the `flat` schema. | |
|
||
Per-tile validation will catch individual invalid tiles, but it will not detect overall datastore corruption (e.g. missing tiles or tiles that shouldn't exist, or tiles with incorrect z/x/y values). | ||
For that, Martin `mbtiles` tool defines a new metadata value called `agg_tiles_hash`. The value is computed by hashing `cast(zoom_level AS text), cast(tile_column AS text), cast(tile_row AS text), tile_data` combined for all rows in the `tiles` table/view, ordered by z,x,y. | ||
If there are no rows, or all values are NULL, the hash of an empty string is used. Note that SQLite allows a value of any type to be stored in any column, so if `tile_data` accidentally contains a value that is not a blob, text, or NULL, validation will fail. | |
|
||
The `mbtiles` tool will compute `agg_tiles_hash` value when copying or validating mbtiles files. | ||
|
||
## Supported Schema | ||
The `mbtiles` tool supports three different kinds of schema for `tiles` data in `.mbtiles` files. See also the original [specification](https://github.com/mapbox/mbtiles-spec#readme). | ||
|
||
### flat | ||
```sql, ignore | ||
CREATE TABLE tiles (zoom_level integer, tile_column integer, tile_row integer, tile_data blob); | ||
CREATE UNIQUE INDEX tile_index on tiles (zoom_level, tile_column, tile_row); | ||
``` | ||
|
||
### flat-with-hash | ||
```sql, ignore | ||
CREATE TABLE tiles_with_hash ( | ||
zoom_level integer NOT NULL, | ||
tile_column integer NOT NULL, | ||
tile_row integer NOT NULL, | ||
tile_data blob, | ||
tile_hash text); | ||
CREATE UNIQUE INDEX tiles_with_hash_index on tiles_with_hash (zoom_level, tile_column, tile_row); | ||
CREATE VIEW tiles AS SELECT zoom_level, tile_column, tile_row, tile_data FROM tiles_with_hash; | ||
``` | ||
|
||
### normalized | ||
```sql, ignore | ||
CREATE TABLE map (zoom_level INTEGER, tile_column INTEGER, tile_row INTEGER, tile_id TEXT); | ||
CREATE UNIQUE INDEX map_index ON map (zoom_level, tile_column, tile_row); | ||
CREATE TABLE images (tile_id text, tile_data blob); | ||
CREATE UNIQUE INDEX images_id ON images (tile_id); | ||
CREATE VIEW tiles AS | ||
SELECT | ||
map.zoom_level AS zoom_level, | ||
map.tile_column AS tile_column, | ||
map.tile_row AS tile_row, | ||
images.tile_data AS tile_data | ||
FROM map | ||
JOIN images ON images.tile_id = map.tile_id; | ||
``` | ||
|
||
Optionally, `.mbtiles` files with `normalized` schema can include a `tiles_with_hash` view: | ||
|
||
```sql, ignore | ||
CREATE VIEW tiles_with_hash AS | ||
SELECT | ||
map.zoom_level AS zoom_level, | ||
map.tile_column AS tile_column, | ||
map.tile_row AS tile_row, | ||
images.tile_data AS tile_data, | ||
images.tile_id AS tile_hash | ||
FROM map LEFT JOIN images ON map.tile_id = images.tile_id; | ||
``` | ||
This tool can be installed by compiling the latest released version with `cargo install mbtiles`, or by downloading a pre-built binary from the [releases page](https://github.com/maplibre/martin/releases/latest). | ||
|
||
**__Note:__** All `normalized` files created by the `mbtiles` tool will contain this view. | ||
The `mbtiles` utility builds on top of the [MBTiles specification](https://github.com/mapbox/mbtiles-spec). It adds a few additional conventions to ensure that the content of the tile data is valid, and can be used for reliable diffing and patching of the tilesets. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,22 @@ | ||
# `mbtiles` Metadata Access | ||
|
||
## meta-all | ||
Print all metadata values to stdout, as well as the results of tile detection. The format of the values printed is not stable, and should only be used for visual inspection. | ||
|
||
```shell | ||
mbtiles meta-all my_file.mbtiles | ||
``` | ||
|
||
## meta-get | ||
Retrieve a raw metadata value by its name. The value is printed to stdout without any modifications. For example, to get the `description` value from an mbtiles file: | |
|
||
```shell | ||
mbtiles meta-get my_file.mbtiles description | ||
``` | ||
|
||
## meta-set | ||
Set a metadata value by its name, or delete the key if no value is supplied. For example, to set the `description` value to `A vector tile dataset`: | |
|
||
```shell | ||
mbtiles meta-set my_file.mbtiles description "A vector tile dataset" | ||
``` |
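The `meta-get` and `meta-set` behavior described above can be pictured as simple reads and writes against the `metadata` table (columns `name` and `value`) defined by the MBTiles specification. The following is a minimal Python sketch under that assumption, not the tool's actual implementation; the helper names `meta_get` and `meta_set` are hypothetical, and an in-memory database stands in for `my_file.mbtiles`.

```python
import sqlite3


def meta_get(conn, name):
    """Return the raw value stored under `name`, or None if the key is absent."""
    row = conn.execute(
        "SELECT value FROM metadata WHERE name = ?", (name,)
    ).fetchone()
    return row[0] if row else None


def meta_set(conn, name, value=None):
    """Set `name` to `value`, or delete the key when no value is supplied."""
    # Delete first so the sketch works even without a unique index on `name`.
    conn.execute("DELETE FROM metadata WHERE name = ?", (name,))
    if value is not None:
        conn.execute(
            "INSERT INTO metadata (name, value) VALUES (?, ?)", (name, value)
        )


# In-memory stand-in for my_file.mbtiles (assumption: spec-style metadata table).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metadata (name TEXT, value TEXT)")

meta_set(conn, "description", "A vector tile dataset")
print(meta_get(conn, "description"))

meta_set(conn, "description")  # no value supplied: the key is deleted
print(meta_get(conn, "description"))
```

Deleting before inserting keeps the sketch independent of whether a particular file declares a unique index on `name`.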
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,57 @@ | ||
# Copying, Diffing, and Patching MBTiles | ||
|
||
## `mbtiles copy` | ||
The `copy` command copies an mbtiles file, optionally filtering its content by zoom level. | |
|
||
```shell | ||
mbtiles copy src_file.mbtiles dst_file.mbtiles \ | ||
--min-zoom 0 --max-zoom 10 | ||
``` | ||
|
||
This command can also be used to generate files that use a different [supported schema](54-mbtiles-schema.md). | |
|
||
```shell | ||
mbtiles copy normalized.mbtiles dst.mbtiles \ | ||
--dst-mbttype flat-with-hash | ||
``` | ||
|
||
## `mbtiles copy --diff-with-file` | ||
The `copy` command can also be used to compare two mbtiles files and generate a delta (diff) file. The diff file can be applied to the `src_file.mbtiles` elsewhere, to avoid copying or transmitting the entire modified dataset. The delta file will contain all tiles that differ between the two files (modifications, insertions, and deletions as `NULL` values), for both the tile and metadata tables. | |
|
||
There is one exception: the `agg_tiles_hash` metadata value will be renamed to `agg_tiles_hash_in_diff`, and a new `agg_tiles_hash` will be generated for the diff file itself. This is done to avoid confusion when applying the diff file to the original file, as the `agg_tiles_hash` value will be different after the diff is applied. The `apply-diff` command will automatically rename the `agg_tiles_hash_in_diff` value back to `agg_tiles_hash` when applying the diff. | |
|
||
```shell | ||
mbtiles copy src_file.mbtiles diff_file.mbtiles \ | ||
--diff-with-file modified_file.mbtiles | ||
``` | ||
|
||
## `mbtiles copy --apply-patch` | ||
|
||
Copy a source file to a destination while also applying the diff file generated by the `copy --diff-with-file` command above to the destination mbtiles file. This allows safer application of the diff file, as the source file is not modified. | |
|
||
```shell | ||
mbtiles copy src_file.mbtiles dst_file.mbtiles \ | ||
--apply-patch diff_file.mbtiles | ||
``` | ||
|
||
## `mbtiles apply-patch` | ||
|
||
Apply the diff file generated by the `copy --diff-with-file` command above to an mbtiles file. The diff file can be applied to the `src_file.mbtiles` elsewhere, to avoid copying or transmitting the entire modified dataset. | |
|
||
Note that the `agg_tiles_hash_in_diff` metadata value will be renamed to `agg_tiles_hash` when applying the diff. This is done to avoid confusion when applying the diff file to the original file, as the `agg_tiles_hash` value will be different after the diff is applied. | ||
|
||
```shell | ||
mbtiles apply-patch src_file.mbtiles diff_file.mbtiles | |
``` | ||
|
||
#### Applying diff with SQLite | ||
Another way to apply the diff is to use the `sqlite3` command line tool directly. This SQL will delete all tiles from `src_file.mbtiles` that are set to `NULL` in `diff_file.mbtiles`, and then insert or update all new tiles from `diff_file.mbtiles` into `src_file.mbtiles`, where both files are of `flat` type. The name of the diff file is passed as a query parameter to the sqlite3 command line tool, and then used in the SQL statements. Note that this does not update the `agg_tiles_hash` metadata value, so it will be incorrect after the diff is applied. | ||
|
||
```shell | ||
sqlite3 src_file.mbtiles \ | ||
-bail \ | ||
-cmd ".parameter set @diffDbFilename diff_file.mbtiles" \ | ||
"ATTACH DATABASE @diffDbFilename AS diffDb;" \ | ||
"DELETE FROM tiles WHERE (zoom_level, tile_column, tile_row) IN (SELECT zoom_level, tile_column, tile_row FROM diffDb.tiles WHERE tile_data ISNULL);" \ | ||
"INSERT OR REPLACE INTO tiles (zoom_level, tile_column, tile_row, tile_data) SELECT * FROM diffDb.tiles WHERE tile_data NOTNULL;" | ||
``` |
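To see what the two SQL statements in "Applying diff with SQLite" do, here is a small Python sketch that runs equivalent statements against two in-memory databases using the `flat` schema. It is an illustration only: the sample tiles and the helper name `apply_flat_diff` are made up, `IS NULL` / `IS NOT NULL` are used in place of `ISNULL` / `NOTNULL`, and, as noted above, this approach does not update `agg_tiles_hash`.

```python
import sqlite3

# Flat schema, created once in the main database and once in the attached one.
FLAT_SCHEMA = """
CREATE TABLE {db}.tiles (
    zoom_level INTEGER, tile_column INTEGER, tile_row INTEGER, tile_data BLOB);
CREATE UNIQUE INDEX {db}.tile_index ON tiles (zoom_level, tile_column, tile_row);
"""


def apply_flat_diff(conn):
    """Apply diffDb.tiles to main.tiles; both must use the flat schema."""
    # Tiles stored as NULL in the diff are deletions.
    conn.execute(
        "DELETE FROM main.tiles WHERE (zoom_level, tile_column, tile_row) IN "
        "(SELECT zoom_level, tile_column, tile_row FROM diffDb.tiles "
        " WHERE tile_data IS NULL)"
    )
    # Everything else is an insertion or a modification.
    conn.execute(
        "INSERT OR REPLACE INTO main.tiles "
        "(zoom_level, tile_column, tile_row, tile_data) "
        "SELECT zoom_level, tile_column, tile_row, tile_data "
        "FROM diffDb.tiles WHERE tile_data IS NOT NULL"
    )


# In-memory stand-ins for src_file.mbtiles (main) and diff_file.mbtiles (diffDb).
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS diffDb")
conn.executescript(FLAT_SCHEMA.format(db="main") + FLAT_SCHEMA.format(db="diffDb"))

conn.executemany(
    "INSERT INTO main.tiles VALUES (?, ?, ?, ?)",
    [(0, 0, 0, b"a"), (1, 0, 0, b"b"), (1, 1, 0, b"c")],
)
conn.executemany(
    "INSERT INTO diffDb.tiles VALUES (?, ?, ?, ?)",
    [(1, 0, 0, None), (1, 1, 0, b"C"), (1, 1, 1, b"d")],  # delete, modify, insert
)

apply_flat_diff(conn)
patched = conn.execute(
    "SELECT zoom_level, tile_column, tile_row, tile_data FROM main.tiles "
    "ORDER BY zoom_level, tile_column, tile_row"
).fetchall()
print(patched)
```

The unique index on `(zoom_level, tile_column, tile_row)` is what lets `INSERT OR REPLACE` overwrite a modified tile instead of adding a second row for the same coordinates.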
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,38 @@ | ||
# MBTiles Validation | ||
|
||
The original [MBTiles specification](https://github.com/mapbox/mbtiles-spec#readme) does not provide any guarantees for the content of the tile data in MBTiles. `mbtiles validate` assumes a few additional conventions and uses them to check that the tile data is valid by performing several validation steps. If the file is not valid, the command will print an error message and exit with a non-zero exit code. | |
|
||
```shell | ||
mbtiles validate src_file.mbtiles | ||
``` | ||
|
||
## SQLite Integrity check | ||
The `validate` command will run `PRAGMA integrity_check` on the file, and will fail if the result is not `ok`. The `--integrity-check` flag can be used to disable this check, or to make it more thorough with the `full` value. The default is `quick`. | |
|
||
## Schema check | ||
The `validate` command will verify that the `tiles` table/view exists, and that it has the expected columns and indexes. It will also verify that the `metadata` table/view exists, and that it has the expected columns and indexes. | ||
|
||
## Per-tile validation | ||
If the `.mbtiles` file uses the [flat-with-hash](54-mbtiles-schema.md#flat-with-hash) or [normalized](54-mbtiles-schema.md#normalized) schema, the `validate` command will verify that the MD5 hash of the `tile_data` column matches the `tile_hash` or `tile_id` column, respectively. | |
|
||
A typical normalized schema generated by tools like [tilelive-copy](https://github.com/mapbox/TileLive#bintilelive-copy) uses an MD5 hash in the `tile_id` column. Martin's `mbtiles` tool can use this hash to verify the content of each tile. We also define a new [flat-with-hash](54-mbtiles-schema.md#flat-with-hash) schema that stores the hash and tile data in the same table, allowing per-tile validation without the multiple-table layout. | |
|
||
Per-tile validation is not available for the `flat` schema, and will be skipped. | ||
|
||
## Aggregate Content Validation | ||
|
||
Per-tile validation will catch individual tile corruption, but it will not detect overall datastore corruption such as missing tiles, tiles that should not exist, or tiles with incorrect z/x/y values. For that, the `mbtiles` tool defines a new metadata value called `agg_tiles_hash`. | ||
|
||
The value is computed by hashing the combined values of all rows in the `tiles` table/view, ordered by z,x,y, using the following SQL expression. The expression relies on a custom `md5_concat_hex` function from the [sqlite-hashes crate](https://crates.io/crates/sqlite-hashes): | |
|
||
```sql, ignore | ||
md5_concat_hex( | ||
CAST(zoom_level AS TEXT), | ||
CAST(tile_column AS TEXT), | ||
CAST(tile_row AS TEXT), | ||
tile_data) | ||
``` | ||
|
||
If there are no rows, or all values are NULL, the hash of an empty string is used. Note that SQLite allows a value of any type to be stored in any column, so if `tile_data` accidentally contains a value that is not a blob, text, or NULL, validation will fail. | |
|
||
The `mbtiles` tool will compute the `agg_tiles_hash` value when copying or validating mbtiles files. Use `--update-agg-tiles-hash` to force the value to be updated, even if it is incorrect or does not exist. |
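The aggregate hash described above can be approximated outside SQLite. The sketch below is a plain-Python reading of the description (concatenate the text-cast z, x, y and the tile data of every row in z,x,y order, skip NULLs, and take the MD5 of the result), not the `md5_concat_hex` implementation itself. Whether its output matches the stored `agg_tiles_hash` byte for byte, including the letter case of the hex digits, is an assumption that has not been verified here. The function name `agg_tiles_hash` is chosen for illustration.

```python
import hashlib
import sqlite3


def agg_tiles_hash(conn):
    """MD5 over z, x, y (cast to text) and tile_data of all rows, in z,x,y order."""
    md5 = hashlib.md5()
    rows = conn.execute(
        "SELECT CAST(zoom_level AS TEXT), CAST(tile_column AS TEXT), "
        "       CAST(tile_row AS TEXT), tile_data "
        "FROM tiles ORDER BY zoom_level, tile_column, tile_row"
    )
    for row in rows:
        for value in row:
            if value is None:
                continue  # NULL values contribute nothing to the hash
            if isinstance(value, str):
                value = value.encode("utf-8")
            if not isinstance(value, bytes):
                # Mirrors the note above: other value types fail validation.
                raise TypeError(f"unexpected value type: {type(value).__name__}")
            md5.update(value)
    # With no rows (or only NULLs) this is the MD5 of an empty string.
    return md5.hexdigest()


# In-memory stand-in for a flat-schema file with two sample tiles.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tiles (zoom_level INTEGER, tile_column INTEGER, "
    "tile_row INTEGER, tile_data BLOB)"
)
conn.executemany(
    "INSERT INTO tiles VALUES (?, ?, ?, ?)",
    [(1, 0, 0, b"b"), (0, 0, 0, b"a")],
)

before = agg_tiles_hash(conn)
conn.execute("DELETE FROM tiles WHERE zoom_level = 1")  # simulate a missing tile
after = agg_tiles_hash(conn)
print(before != after)
```

Because the coordinates are part of the hashed input, a missing tile or a tile stored under the wrong z/x/y changes the aggregate value even when every individual tile blob is intact.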
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,80 @@ | ||
# MBTiles Schemas | ||
The `mbtiles` tool builds on top of the original [MBTiles specification](https://github.com/mapbox/mbtiles-spec#readme) by specifying three different kinds of schema for `tiles` data: `flat`, `flat-with-hash`, and `normalized`. The `mbtiles` tool can convert between these schemas, and can also generate a diff between two files of any schema, as well as merge multiple schema files into one file. | |
|
||
## flat | ||
Flat schema is the closest to the original MBTiles specification. It stores all tiles in a single table. This schema is the most efficient when the tileset contains no duplicate tiles. | ||
|
||
```sql, ignore | ||
CREATE TABLE tiles ( | ||
zoom_level INTEGER, | ||
tile_column INTEGER, | ||
tile_row INTEGER, | ||
tile_data BLOB); | ||
CREATE UNIQUE INDEX tile_index on tiles ( | ||
zoom_level, tile_column, tile_row); | ||
``` | ||
|
||
## flat-with-hash | ||
Similar to the `flat` schema, but also includes a `tile_hash` column that contains a hash value of the `tile_data` column. Use this schema when the tileset has no duplicate tiles, but you still want to be able to validate the content of each tile individually. | ||
|
||
```sql, ignore | ||
CREATE TABLE tiles_with_hash ( | ||
zoom_level INTEGER NOT NULL, | ||
tile_column INTEGER NOT NULL, | ||
tile_row INTEGER NOT NULL, | ||
tile_data BLOB, | ||
tile_hash TEXT); | ||
CREATE UNIQUE INDEX tiles_with_hash_index on tiles_with_hash ( | ||
zoom_level, tile_column, tile_row); | ||
CREATE VIEW tiles AS | ||
SELECT zoom_level, tile_column, tile_row, tile_data | ||
FROM tiles_with_hash; | ||
``` | ||
|
||
## normalized | ||
Normalized schema is the most efficient when the tileset contains duplicate tiles. It stores all tile blobs in the `images` table, and stores the tile Z,X,Y coordinates in a `map` table. The `map` table contains a `tile_id` column that is a foreign key to the `images` table. The `tile_id` column is a hash of the `tile_data` column, making it possible to both validate each individual tile like in the `flat-with-hash` schema, and also to optimize storage by storing each unique tile only once. | ||
|
||
```sql, ignore | ||
CREATE TABLE map ( | ||
zoom_level INTEGER, | ||
tile_column INTEGER, | ||
tile_row INTEGER, | ||
tile_id TEXT); | ||
CREATE TABLE images ( | ||
tile_id TEXT, | ||
tile_data BLOB); | ||
CREATE UNIQUE INDEX map_index ON map ( | ||
zoom_level, tile_column, tile_row); | ||
CREATE UNIQUE INDEX images_id ON images ( | ||
tile_id); | ||
CREATE VIEW tiles AS | ||
SELECT | ||
map.zoom_level AS zoom_level, | ||
map.tile_column AS tile_column, | ||
map.tile_row AS tile_row, | ||
images.tile_data AS tile_data | ||
FROM | ||
map JOIN images | ||
ON images.tile_id = map.tile_id; | ||
``` | ||
|
||
Optionally, `.mbtiles` files with `normalized` schema can include a `tiles_with_hash` view. All `normalized` files created by the `mbtiles` tool will contain this view. | ||
|
||
```sql, ignore | ||
CREATE VIEW tiles_with_hash AS | ||
SELECT | ||
map.zoom_level AS zoom_level, | ||
map.tile_column AS tile_column, | ||
map.tile_row AS tile_row, | ||
images.tile_data AS tile_data, | ||
images.tile_id AS tile_hash | ||
FROM | ||
map JOIN images | ||
ON map.tile_id = images.tile_id; | ||
``` |
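The storage and validation properties of the `normalized` schema can be tried out with a short Python sketch. It creates the tables and the `tiles` view shown above in an in-memory database, stores three tiles of which two are identical, and then checks each blob against its `tile_id`. The helper names `put_tile` and `mismatched_tile_ids` and the sample data are hypothetical, and using a lowercase MD5 hex digest as `tile_id` is an assumption about the hash format.

```python
import hashlib
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE map (
    zoom_level INTEGER, tile_column INTEGER, tile_row INTEGER, tile_id TEXT);
CREATE TABLE images (tile_id TEXT, tile_data BLOB);
CREATE UNIQUE INDEX map_index ON map (zoom_level, tile_column, tile_row);
CREATE UNIQUE INDEX images_id ON images (tile_id);
CREATE VIEW tiles AS
    SELECT map.zoom_level AS zoom_level, map.tile_column AS tile_column,
           map.tile_row AS tile_row, images.tile_data AS tile_data
    FROM map JOIN images ON images.tile_id = map.tile_id;
""")


def put_tile(conn, z, x, y, data):
    """Store a tile; identical blobs share a single row in `images`."""
    tile_id = hashlib.md5(data).hexdigest()  # assumption: lowercase hex digest
    conn.execute(
        "INSERT OR IGNORE INTO images (tile_id, tile_data) VALUES (?, ?)",
        (tile_id, data),
    )
    conn.execute(
        "INSERT OR REPLACE INTO map (zoom_level, tile_column, tile_row, tile_id) "
        "VALUES (?, ?, ?, ?)",
        (z, x, y, tile_id),
    )


def mismatched_tile_ids(conn):
    """Per-tile check: list tile_ids whose blob no longer hashes to the id."""
    rows = conn.execute("SELECT tile_id, tile_data FROM images").fetchall()
    return [
        tile_id
        for tile_id, data in rows
        if hashlib.md5(data).hexdigest() != tile_id.lower()
    ]


put_tile(conn, 1, 0, 0, b"ocean")
put_tile(conn, 1, 1, 0, b"ocean")  # duplicate blob, stored only once
put_tile(conn, 1, 0, 1, b"land")

image_count = conn.execute("SELECT count(*) FROM images").fetchone()[0]
tile_count = conn.execute("SELECT count(*) FROM tiles").fetchone()[0]
print(image_count, tile_count)
print(mismatched_tile_ids(conn))
```

The `tiles` view still exposes one row per coordinate, so readers that only know the original specification see the same data regardless of how the blobs are stored.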