diff --git a/documentation/reference/sql/pivot.md b/documentation/reference/sql/pivot.md new file mode 100644 index 00000000..60c3993c --- /dev/null +++ b/documentation/reference/sql/pivot.md @@ -0,0 +1,249 @@ +--- +title: PIVOT keyword +sidebar_label: PIVOT +description: PIVOT SQL keyword reference documentation. +--- + +`PIVOT` allows you to pivot rows into a columns. This can be useful when you want to ingest narrow-schema data, +and then pivot it into a wide-schema. + +This syntax is supported within `SELECT` queries. + +## Syntax + +![Flow chart showing the syntax of the PIVOT keyword](/images/docs/diagrams/pivot.svg) + +## Mechanics + +The `PIVOT` keyword comes after a general table select. + +There are two components: + +#### Aggregate Columns + +These columns appear immediately after the `PIVOT` keyword. These are aggregates that will be +calculated for each of the Pivot columns. These are the values that will be placed in the output columns. + +#### Pivot Columns + +These columns appear after the `FOR` keyword, and define the filtering and final column names. The aggregate +functions will be run for each of these. + +### Single aggregate and pivot + +```questdb-sql title="basic PIVOT" demo +(trades LIMIT 1000) + PIVOT ( + avg(price) + FOR symbol IN ('BTC-USD', 'ETH-USD') + ); +``` + +| BTC-USD | ETH-USD | +| ------------------ | ----------------- | +| 39282.200736543906 | 2616.588454404948 | + + +This query calculates an average `price` based on filtering rows that contain the `symbol`s defined in the queries. + +In short, this shifts the `symbol` names into the column position, and fills it with the corresponding aggregate value. + +An equivalent non-pivot query might look like this: + +```questdb-sql title="basic PIVOT without PIVOT demo +SELECT + avg(CASE WHEN symbol = 'BTC-USD' THEN price END) AS 'BTC-USD', + avg(CASE WHEN symbol = 'ETH-USD' THEN price END) AS 'ETH-USD' +FROM trades +``` + +### Multiple aggregates, single pivot + +```questdb-sql title="multiple aggregates" demo +(trades LIMIT 1000) + PIVOT ( + avg(price), + count(price) + FOR symbol IN ('BTC-USD', 'ETH-USD') + ); +``` + +| BTC-USD_avg | BTC-USD_count | ETH-USD_avg | ETH-USD_count | +| ------------------ | ------------- | ----------------- | ------------- | +| 39282.200736543906 | 353 | 2616.588454404948 | 647 | + +In this case, the aggregate functions are applied to each of the filtered symbols, so the final output has $2 \times 2 = 4$ columns. + +### Single aggregate, multiple pivots + +```questdb-sql title="multiple pivots" demo +(trades LIMIT 1000) + PIVOT ( + avg(price) + FOR symbol IN ('BTC-USD', 'ETH-USD') + side IN ('buy', 'sell') + ); +``` + +| BTC-USD_buy | BTC-USD_sell | ETH-USD_buy | ETH-USD_sell | +| ------------------ | ----------------- | ----------------- | ----------------- | +| 39286.997461139894 | 39276.41468750003 | 2616.850413223139 | 2616.253626760561 | + +In this case, the aggregate function is applied to each of the symbols, combinatorially. + +Therefore, the output dataset is $1 x (2 x 2) = 4$ columns. + +### Multiple aggregates, multiple pivots + +```questdb-sql title="multiple aggregates and pivots" demo +(trades LIMIT 1000) + PIVOT ( + avg(price), + count(price) + FOR symbol IN ('BTC-USD', 'ETH-USD') + side IN ('buy', 'sell') + ); +``` + +| BTC-USD_buy_avg | BTC-USD_buy_count | BTC-USD_sell_avg | BTC-USD_sell_count | ETH-USD_buy_avg | ETH-USD_buy_count | ETH-USD_sell_avg | ETH-USD_sell_count | +| ------------------ | ----------------- | ----------------- | ------------------ | ----------------- | ----------------- | ----------------- | ------------------ | +| 39286.997461139894 | 193 | 39276.41468750003 | 160 | 2616.850413223139 | 363 | 2616.253626760561 | 284 | + +Each of the aggregates is applied to each combination of pivot columns. Therefore, the output column count is $2 x 2 x 2 = 8$. + +### Aliasing aggregate columns + +If you wish to control the column output name, or need to override it to avoid duplicate issues, you can set an alias. + +```questdb-sql title="aggregate with alias" demo +(trades LIMIT 1000) + PIVOT ( + avg(price) as average_price + FOR symbol IN ('BTC-USD', 'ETH-USD') + ); +``` + +| BTC-USD_average_price | ETH-USD_average_price | +| --------------------- | --------------------- | +| 39282.200736543906 | 2616.588454404948 | + +### With `GROUP BY` + +You can add an explicit group by to the PIVOT clause to modify the output result set. + +Consider this basic case, where we are just taking an average price: + +```questdb-sql title="pivot without explicit group by" demo +(trades LIMIT 1000) + PIVOT ( + avg(price) + FOR symbol IN ('BTC-USD') + ); +``` + +| BTC-USD | +| ------------------ | +| 39282.200736543906 | + + +Perhaps we actually want to run this for both `buy` and `sell` sides? In earlier examples, we demonstrated how +you can do this with multiple output columns: + +```questdb-sql title="multiple pivots without explicit group by" demo +(trades LIMIT 1000) + PIVOT ( + avg(price) + FOR symbol IN ('BTC-USD') + side IN ('buy', 'sell') + ); +``` + +| BTC-USD_buy | BTC-USD_sell | +| ------------------ | ----------------- | +| 39286.997461139894 | 39276.41468750003 | + +But perhaps we'd rather just have a `side` and `BTC-USD` column, with two rows in the output? + +```questdb-sql title="pivot with explicit group by" demo +(trades LIMIT 1000) + PIVOT ( + avg(price) + FOR symbol IN ('BTC-USD') + GROUP BY side + ); +``` + +| side | BTC-USD | +| ---- | ------------------ | +| buy | 39286.997461139894 | +| sell | 39276.41468750003 | + +You can imagine that the above query is equivalent to: + +```questdb-sql title="above without using pivot" demo +SELECT side, + avg(price) as 'BTC-USD' + FROM (trades LIMIT 1000) + WHERE symbol = 'BTC-USD' + GROUP BY side, symbol +``` + +This then scales up as you add more clauses to the `PIVOT`: + +```questdb-sql title="explicit group by and multiple clauses" demo +(trades LIMIT 1000) + PIVOT ( + avg(price), + count(price) + FOR symbol IN ('BTC-USD', 'ETH-USD') + GROUP BY side + ); +``` + +| side | BTC-USD_avg | BTC-USD_count | ETH-USD_avg | ETH-USD_count | +| ---- | ------------------ | ------------- | ----------------- | ------------- | +| sell | 39276.41468750003 | 160 | 2616.253626760561 | 284 | +| buy | 39286.997461139894 | 193 | 2616.850413223139 | 363 | + +### With `ORDER BY` + +We can add an `ORDER BY` clause to sort the final result set by a column. For example, if +we wanted to guarantee the ordering to be the `buy` row, then `sell`: + +```questdb-sql title="explicit group by and order by" demo +(trades LIMIT 1000) + PIVOT ( + avg(price), + count(price) + FOR symbol IN ('BTC-USD', 'ETH-USD') + GROUP BY side + ORDER BY side + ); +``` + +| side | BTC-USD_avg | BTC-USD_count | ETH-USD_avg | ETH-USD_count | +| ---- | ------------------ | ------------- | ----------------- | ------------- | +| buy | 39286.997461139894 | 193 | 2616.850413223139 | 363 | +| sell | 39276.41468750003 | 160 | 2616.253626760561 | 284 | + + +### With `LIMIT` + +Additionally, you can tag a `LIMIT` on the query. So we could take the above result set and select just the first row. + +```questdb-sql title="explicit group by and order by and limit" demo +(trades LIMIT 1000) + PIVOT ( + avg(price), + count(price) + FOR symbol IN ('BTC-USD', 'ETH-USD') + GROUP BY side + ORDER BY side + LIMIT 1 + ); +``` + +| side | BTC-USD_avg | BTC-USD_count | ETH-USD_avg | ETH-USD_count | +| ---- | ------------------ | ------------- | ----------------- | ------------- | +| buy | 39286.997461139894 | 193 | 2616.850413223139 | 363 | + diff --git a/documentation/reference/sql/unpivot.md b/documentation/reference/sql/unpivot.md new file mode 100644 index 00000000..bfc79089 --- /dev/null +++ b/documentation/reference/sql/unpivot.md @@ -0,0 +1,96 @@ +--- +title: UNPIVOT keyword +sidebar_label: UNPIVOT +description: UNPIVOT SQL keyword reference documentation. +--- + +`UNPIVOT` allows you to pivot columns into rows. This allows you to condense values from +multiple columns into a single column. + +This syntax is supported within `SELECT` queries. + +## Syntax + +![Flow chart showing the syntax of the UNPIVOT keyword](/images/docs/diagrams/unpivot.svg) + +## Mechanics + +The `UNPIVOT` keyword comes after a general table select. + +There are two components: + +#### Value Column + +This column will contain the values copied from the unpivot columns; + +#### Unpivot Columns + +This column appears after the `FOR` keyword, and contains the names of the columns from which values will be taken. + + +### PIVOT/UNPIVOT Round Trip + +One of the easier ways to demonstrate `UNPIVOT` is to combine it with `PIVOT`. + +Let's take the following `PIVOT` query: + +```questdb-sql title="basic PIVOT" demo +(trades LIMIT 1000) + PIVOT ( + avg(price) + FOR symbol IN ('BTC-USD', 'ETH-USD') + ); +``` + +| BTC-USD | ETH-USD | +| ------------------ | ----------------- | +| 39282.200736543906 | 2616.588454404948 | + +As you can see, the result set contains two columns, with names originating from the `symbol` column, +and values originating from the `price` column. + +This data is now column modelled. If we want to convert it back to row-modelled, we can +use `UNPIVOT`. + +```questdb-sql title="round trip" demo +( + (trades LIMIT 1000) + PIVOT ( + avg(price) + FOR symbol IN ('BTC-USD', 'ETH-USD') + ) +) UNPIVOT ( + avg_price + FOR symbol IN ('BTC-USD', 'ETH-USD') + ); +``` + +| symbol | avg_price | +| ------- |--------------------| +| BTC-USD | 39282.200736543906 | +| ETH-USD | 2616.588454404948 | + +In this `UNPIVOT` query, we transpose the column names `('BTC-USD', 'ETH-USD')` into +the `symbol` column, with their matching values in a + + +### Basic unpivot + +```questdb-sql title="basic UNPIVOT" demo +(select timestamp, + symbol, + side, + ask_px_00, + ask_px_01, + ask_px_02, + ask_px_03 + FROM AAPL_orderbook LIMIT 1000 + ) UNPIVOT ( + prices + FOR price_type IN ( + 'ask_px_00', + 'ask_px_01', + 'ask_px_02', + 'ask_px_03') + ) +``` diff --git a/documentation/sidebars.js b/documentation/sidebars.js index 9f7518df..696295e6 100644 --- a/documentation/sidebars.js +++ b/documentation/sidebars.js @@ -234,6 +234,7 @@ module.exports = { "reference/sql/latest-on", "reference/sql/limit", "reference/sql/order-by", + "reference/sql/pivot", "reference/sql/reindex", "reference/sql/rename", { @@ -257,6 +258,7 @@ module.exports = { }, "reference/sql/truncate", "reference/sql/union-except-intersect", + "reference/sql/unpivot", "reference/sql/update", "reference/sql/vacuum-table", "reference/sql/where", diff --git a/static/images/docs/diagrams/.railroad b/static/images/docs/diagrams/.railroad index 4b118094..7939960e 100644 --- a/static/images/docs/diagrams/.railroad +++ b/static/images/docs/diagrams/.railroad @@ -358,3 +358,20 @@ disableDedup enableDedup ::= 'ALTER' 'TABLE' tableName 'DEDUP' 'ENABLE' 'UPSERT' 'KEYS' '(' (column ( ',' column )* ) ')' + + +pivot +::= ( 'SELECT' '*' 'FROM' )? tableNameExpr 'PIVOT' '(' aggregateExpr 'AS' aliasExpr + ( ',' aggregateExpr 'AS' aliasExpr )* 'FOR' + pivotColumn 'IN' '(' listExpr ')' ( pivotColumn 'IN' '(' listExpr ')' )* + ('GROUP' 'BY' groupByExpr ( ',' groupByExpr )* )? + ('ORDER' 'BY' orderByExpr ( ',' orderByExpr )* )? + ('LIMIT' limitExpr)? + ')' + +unpivot +::= ( 'SELECT' '*' 'FROM' )? tableNameExpr 'UNPIVOT' ( ( EXCLUDE | INCLUDE ) NULLS )? + '(' + valueColumnExpr 'FOR' + unpivotColumn 'IN' '(' listExpr ')' + ')' diff --git a/static/images/docs/diagrams/pivot.svg b/static/images/docs/diagrams/pivot.svg new file mode 100644 index 00000000..a48d7a49 --- /dev/null +++ b/static/images/docs/diagrams/pivot.svg @@ -0,0 +1,109 @@ + + + + + + + + + SELECT + + + * + + + FROM + + + tableNameExpr + + PIVOT + + + ( + + + aggregateExpr + + AS + + + aliasExpr + + , + + + FOR + + + pivotColumn + + IN + + + ( + + + listExpr + + ) + + + GROUP + + + BY + + + groupByExpr + + , + + + ORDER + + + BY + + + orderByExpr + + , + + + LIMIT + + + limitExpr + + ) + + + + \ No newline at end of file diff --git a/static/images/docs/diagrams/unpivot.svg b/static/images/docs/diagrams/unpivot.svg new file mode 100644 index 00000000..12d635ac --- /dev/null +++ b/static/images/docs/diagrams/unpivot.svg @@ -0,0 +1,82 @@ + + + + + + + + + SELECT + + + * + + + FROM + + + tableNameExpr + + UNPIVOT + + + EXCLUDE + + + INCLUDE + + + NULLS + + ( + + + valueColumnExpr + + FOR + + + unpivotColumn + + IN + + + ( + + + listExpr + + ) + + + ) + + + + \ No newline at end of file