Generate BigQuery SQL views from JSON
This CLI tool reads a JSON file and produces BigQuery compatible SQL views from the given JSON. It assumes you have a JSON blob column in a BigQuery dataset that you will reference with the created view.
npx @nealwp/blobview@latest [options] <filepath>
Arguments:
filepath path to valid JSON file
Options:
-t TABLE, --table=TABLE specify a table name to use in FROM clause. default: "<table>"
-d DATASET, --dataset=DATASET specify a dataset to use in FROM clause. default: "<dataset>"
-h, --help show help
Default output to STDOUT:
npx @nealwp/blobview ./path/to/file.json
Dataset and table as input options:
npx @nealwp/blobview --dataset=myDataset --table=myTable ./path/to/file.json
# shorthand options
npx @nealwp/blobview -d myDataset -t myTable ./path/to/file.json
Redirect output to file:
npx @nealwp/blobview@latest ./path/to/file.json > my-view-file.sql
- Produces valid BigQuery syntax SQL
- Creates separate query for each nested object
- Detects STRING, DECIMAL, and INTEGER types from JSON data and casts column datatypes accordingly
- Detects GeoJSON Feature Collection type and formats to JSON string
- Auto-formats column names to snake_case from camelCase and PascalCase
- Detects deeply-nested objects and formats to JSON string
- Pre-populates FROM clause with BigQuery-style placeholders
- BigQuery dataset and table name can be supplied as input options
- Does not detect DATE or TIMESTAMP types, or other types like BOOLEAN
- Arrays will get formatted as JSON strings
- Assumes the BigQuery source dataset column name is always
json_blob
- Does not create SQL views in any syntax other than BigQuery
- Requires a local JSON file to read
- Does not include option to write queries to separate files instead of STDOUT
- BigQuery project name cannot be supplied as input
# terminal command
npx @nealwp/blobview sample-data.json
The above command will produce the following output:
/* stdout */
SELECT
CAST(JSON_VALUE(json_blob.stringField) as STRING) as string_field
, CAST(JSON_VALUE(json_blob.integerField) as INTEGER) as integer_field
, CAST(JSON_VALUE(json_blob.decimalField) as DECIMAL) as decimal_field
, TO_JSON_STRING(json_blob.exampleGeoJson) as example_geo_json
FROM <project>.<dataset>.<table>
/**/
SELECT
CAST(JSON_VALUE(json_blob.childField1.gender) as STRING) as gender
, CAST(JSON_VALUE(json_blob.childField1.latitude) as DECIMAL) as latitude
FROM <project>.<dataset>.<table>
/**/
SELECT
CAST(JSON_VALUE(json_blob.childField2.favoriteFruit) as STRING) as favorite_fruit
, CAST(JSON_VALUE(json_blob.childField2.longitude) as DECIMAL) as longitude
FROM <project>.<dataset>.<table>
/**/
SELECT
CAST(JSON_VALUE(json_blob.childWithNestedObject.isNormal) as STRING) as is_normal
, TO_JSON_STRING(json_blob.childWithNestedObject.nestedObject) as nested_object
FROM <project>.<dataset>.<table>
# terminal command with input options
npx @nealwp/blobview --dataset=myDataset --table=myTable sample-data.json
Will produce the following output:
/* stdout */
SELECT
CAST(JSON_VALUE(json_blob.stringField) as STRING) as string_field
, CAST(JSON_VALUE(json_blob.integerField) as INTEGER) as integer_field
, CAST(JSON_VALUE(json_blob.decimalField) as DECIMAL) as decimal_field
, TO_JSON_STRING(json_blob.exampleGeoJson) as example_geo_json
FROM <project>.myDataset.myTable
/**/
SELECT
CAST(JSON_VALUE(json_blob.childField1.gender) as STRING) as gender
, CAST(JSON_VALUE(json_blob.childField1.latitude) as DECIMAL) as latitude
FROM <project>.myDataset.myTable
/**/
SELECT
CAST(JSON_VALUE(json_blob.childField2.favoriteFruit) as STRING) as favorite_fruit
, CAST(JSON_VALUE(json_blob.childField2.longitude) as DECIMAL) as longitude
FROM <project>.myDataset.myTable
/**/
SELECT
CAST(JSON_VALUE(json_blob.childWithNestedObject.isNormal) as STRING) as is_normal
, TO_JSON_STRING(json_blob.childWithNestedObject.nestedObject) as nested_object
FROM <project>.myDataset.myTable