| 1 | +# blobview |
| 2 | +> Generate BigQuery SQL views from JSON |
| 3 | +
| 4 | +[@nealwp/blobview on npm](https://www.npmjs.com/package/@nealwp/blobview) |
| 5 | + |
| 6 | + |
| 7 | + |
| 8 | +This CLI tool reads a JSON file and generates BigQuery-compatible SQL views from it. It assumes you have a JSON blob column in a BigQuery dataset that the generated view will reference. |
| 9 | + |
| 10 | +## Usage |
| 11 | + |
| 12 | +```bash |
| 13 | +npx @nealwp/blobview <filepath> |
| 14 | +``` |
| 15 | + |
| 16 | +## Examples |
| 17 | +Default output to STDOUT: |
| 18 | +```bash |
| 19 | +npx @nealwp/blobview ./path/to/file.json |
| 20 | +``` |
| 21 | + |
| 22 | +Redirect output to file: |
| 23 | +```bash |
| 24 | +npx @nealwp/blobview ./path/to/file.json > my-view-file.sql |
| 25 | +``` |
| 26 | + |
| 27 | +## Features |
| 28 | +* Produces valid BigQuery SQL syntax |
| 29 | +* Creates a separate query for each nested object |
| 30 | +* Detects STRING, DECIMAL, and INTEGER types from the JSON data and casts each column accordingly |
| 31 | +* Detects GeoJSON FeatureCollection objects and formats them as JSON strings |
| 32 | +* Converts camelCase and PascalCase column names to snake_case |
| 33 | +* Detects deeply nested objects and formats them as JSON strings |
| 34 | +* Pre-populates the FROM clause with BigQuery-style placeholders |
| 35 | + |
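The naming and casting rules listed above can be sketched in a few lines of TypeScript. This is a minimal illustration, not blobview's actual source: `toSnakeCase`, `columnExpression`, and the `JsonValue` type are hypothetical names, and the fallback to `TO_JSON_STRING` for every non-string, non-number value is an assumption. The sketch also leaves out the step that splits first-level nested objects into their own queries.

```typescript
// Hypothetical helpers for illustration; not taken from blobview's source.
type JsonValue =
  | string
  | number
  | boolean
  | null
  | JsonValue[]
  | { [key: string]: JsonValue };

// Convert camelCase or PascalCase keys to snake_case.
function toSnakeCase(key: string): string {
  return key.replace(/([a-z0-9])([A-Z])/g, "$1_$2").toLowerCase();
}

// Build one SELECT-list entry for a key of the JSON blob column.
// Covers scalar values plus values that are serialized as a whole.
function columnExpression(column: string, key: string, value: JsonValue): string {
  const ref = `${column}.${key}`;
  const alias = toSnakeCase(key);
  if (typeof value === "string") {
    return `CAST(JSON_VALUE(${ref}) as STRING) as ${alias}`;
  }
  if (typeof value === "number") {
    // JSON has a single number type, so whole numbers are treated as INTEGER.
    const type = Number.isInteger(value) ? "INTEGER" : "DECIMAL";
    return `CAST(JSON_VALUE(${ref}) as ${type}) as ${alias}`;
  }
  // Assumption: arrays, objects, and anything else become a JSON string.
  return `TO_JSON_STRING(${ref}) as ${alias}`;
}

console.log(columnExpression("json_blob", "decimalField", 3.1234));
// CAST(JSON_VALUE(json_blob.decimalField) as DECIMAL) as decimal_field
```

Because JSON does not distinguish `14` from `14.0`, a detector like this one would label a whole-number sample as INTEGER even if other records hold decimals in the same field.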
| 36 | +## Limitations |
| 37 | +* Does not detect DATE, TIMESTAMP, or other types such as BOOLEAN |
| 38 | +* Formats arrays as JSON strings |
| 39 | +* Assumes the BigQuery source dataset column name is always `json_blob` |
| 40 | +* Does not create SQL views in any syntax other than BigQuery |
| 41 | +* Requires a local JSON file to read |
| 42 | +* Has no option to write queries to separate files instead of STDOUT |
| 43 | +* Does not accept BigQuery project, dataset, or datastream names as input |
| 44 | + |
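Since the names cannot be passed to the CLI, one workaround is to post-process the generated SQL. The sketch below assumes the placeholders appear exactly as `<project>`, `<datastream>`, and `<dataset>`, as in the example output in this README. `fillPlaceholders` and `SourceNames` are hypothetical names, and the values used are made up.

```typescript
// Hypothetical post-processing helper; not part of blobview.
interface SourceNames {
  project: string;
  datastream: string;
  dataset: string;
}

// Replace every placeholder in the generated SQL with a real name.
function fillPlaceholders(sql: string, names: SourceNames): string {
  // Function replacers keep "$" in a name from being read as a pattern.
  return sql
    .replace(/<project>/g, () => names.project)
    .replace(/<datastream>/g, () => names.datastream)
    .replace(/<dataset>/g, () => names.dataset);
}

const filled = fillPlaceholders("FROM <project>.<datastream>.<dataset>", {
  project: "my_project", // made-up values
  datastream: "my_stream",
  dataset: "my_table",
});
console.log(filled); // FROM my_project.my_stream.my_table
```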
| 45 | +## Example Output |
| 46 | + |
| 47 | +```jsonc |
| 48 | +// sample-data.json |
| 49 | +{ |
| 50 | + "stringField": "any string goes here", |
| 51 | + "integerField": 1234, |
| 52 | + "decimalField": 3.1234, |
| 53 | + "childField1": { |
| 54 | + "gender": "male", |
| 55 | + "latitude": 32.667598 |
| 56 | + }, |
| 57 | + "childField2": { |
| 58 | + "favoriteFruit": "banana", |
| 59 | + "longitude": 4.219472 |
| 60 | + }, |
| 61 | + "childWithNestedObject": { |
| 62 | + "isNormal": "i sure hope so", |
| 63 | + "nestedObject": { |
| 64 | + "favoritePhilosopher": "Kant", |
| 65 | + "shoeSize": 14.5 |
| 66 | + } |
| 67 | + }, |
| 68 | + "exampleGeoJson": { |
| 69 | + "type": "FeatureCollection", |
| 70 | + "features": [ |
| 71 | + { |
| 72 | + "type": "Feature", |
| 73 | + "id": 16, |
| 74 | + "geometry": { |
| 75 | + "type": "Polygon", |
| 76 | + "coordinates": [ |
| 77 | + [ |
| 78 | + [-32.667598, -4.239272], |
| 79 | + [-32.767999, -4.319272] |
| 80 | + ] |
| 81 | + ] |
| 82 | + } |
| 83 | + } |
| 84 | + ] |
| 85 | + } |
| 86 | +} |
| 87 | +``` |
| 88 | + |
| 89 | +```bash |
| 90 | +# terminal command |
| 91 | +npx @nealwp/blobview sample-data.json |
| 92 | +``` |
| 93 | + |
| 94 | +The command above produces the following output: |
| 95 | + |
| 96 | +```sql |
| 97 | +/* stdout */ |
| 98 | +SELECT |
| 99 | + CAST(JSON_VALUE(json_blob.stringField) as STRING) as string_field |
| 100 | + , CAST(JSON_VALUE(json_blob.integerField) as INTEGER) as integer_field |
| 101 | + , CAST(JSON_VALUE(json_blob.decimalField) as DECIMAL) as decimal_field |
| 102 | + , TO_JSON_STRING(json_blob.exampleGeoJson) as example_geo_json |
| 103 | +FROM <project>.<datastream>.<dataset> |
| 104 | +-------- |
| 105 | +SELECT |
| 106 | + CAST(JSON_VALUE(json_blob.childField1.gender) as STRING) as gender |
| 107 | + , CAST(JSON_VALUE(json_blob.childField1.latitude) as DECIMAL) as latitude |
| 108 | +FROM <project>.<datastream>.<dataset> |
| 109 | +-------- |
| 110 | +SELECT |
| 111 | + CAST(JSON_VALUE(json_blob.childField2.favoriteFruit) as STRING) as favorite_fruit |
| 112 | + , CAST(JSON_VALUE(json_blob.childField2.longitude) as DECIMAL) as longitude |
| 113 | +FROM <project>.<datastream>.<dataset> |
| 114 | +-------- |
| 115 | +SELECT |
| 116 | + CAST(JSON_VALUE(json_blob.childWithNestedObject.isNormal) as STRING) as is_normal |
| 117 | + , TO_JSON_STRING(json_blob.childWithNestedObject.nestedObject) as nested_object |
| 118 | +FROM <project>.<datastream>.<dataset> |
| 119 | +``` |
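The queries in the output are separated by a line of eight hyphens, so they can be split apart after the fact, for example to save each one to its own file. The sketch below is one way to do that, assuming the separator always sits on a line by itself. `splitQueries` is a hypothetical helper, not part of blobview.

```typescript
// Hypothetical helper: split blobview's STDOUT into individual queries.
function splitQueries(output: string): string[] {
  return output
    .split(/^-{8}[ \t]*$/m) // a separator is a line holding only "--------"
    .map((query) => query.trim())
    .filter((query) => query.length > 0);
}

const sample = [
  "SELECT",
  "  CAST(JSON_VALUE(json_blob.stringField) as STRING) as string_field",
  "FROM <project>.<datastream>.<dataset>",
  "--------",
  "SELECT",
  "  CAST(JSON_VALUE(json_blob.childField1.gender) as STRING) as gender",
  "FROM <project>.<datastream>.<dataset>",
].join("\n");

const queries = splitQueries(sample);
console.log(queries.length); // 2
```

Each element of `queries` could then be written to a separate `.sql` file or wrapped in a view definition.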