Merged
28 changes: 26 additions & 2 deletions .github/workflows/ci.yml
@@ -19,10 +19,12 @@ concurrency:

 env:
   CARGO_TERM_COLOR: always
-  INCAN_REF: release/v0.3
-  EXPECTED_INCAN_VERSION: 0.3.0-rc42
+  INCAN_REF: feature/750-primitive-type-tokens
+  EXPECTED_INCAN_VERSION: 0.3.0-rc47
   RUST_BACKTRACE: 1
   INCAN_NO_BANNER: 1
+  INCAN_GENERATED_CARGO_TARGET_DIR: ${{ github.workspace }}/.incan-generated-cargo-target
+  INCAN_TEST_SHARED_TARGET_DIR: ${{ github.workspace }}/.incan-generated-cargo-target

 jobs:
   inql:
@@ -52,6 +54,28 @@ jobs:
         with:
           workspaces: incan -> target

+      - name: Cache generated InQL Cargo artifacts
+        uses: actions/cache@v4
+        with:
+          path: .incan-generated-cargo-target
+          key: inql-generated-cargo-${{ runner.os }}-incan-${{ env.EXPECTED_INCAN_VERSION }}-${{ hashFiles('incan.lock', 'incan.toml') }}
+          restore-keys: |
+            inql-generated-cargo-${{ runner.os }}-incan-${{ env.EXPECTED_INCAN_VERSION }}-
+
+      - name: Cache generated InQL Rust metadata
+        uses: actions/cache@v4
+        with:
+          path: |
+            target/incan_lock/rust_inspect/**/Cargo.toml
+            target/incan_lock/rust_inspect/**/Cargo.lock
+            target/incan_lock/rust_inspect/**/src/main.rs
+            target/incan_lock/rust_inspect/**/.incan_rust_inspect_cache.json
+            target/incan_lock/rust_inspect/**/.incan_rust_inspect_fingerprint
+          key: inql-rust-inspect-${{ runner.os }}-incan-${{ env.EXPECTED_INCAN_VERSION }}-${{ hashFiles('incan.lock', 'incan.toml') }}-${{ hashFiles('src/**/*.incn', 'tests/**/*.incn', 'scripts/**/*.incn') }}
+          restore-keys: |
+            inql-rust-inspect-${{ runner.os }}-incan-${{ env.EXPECTED_INCAN_VERSION }}-${{ hashFiles('incan.lock', 'incan.toml') }}-
+            inql-rust-inspect-${{ runner.os }}-incan-${{ env.EXPECTED_INCAN_VERSION }}-
+
       - name: Build Incan compiler
         working-directory: incan
         run: cargo build --locked --bin incan
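
The two cache steps in this workflow rely on the lookup order of `actions/cache`: an exact match on `key` first, then each `restore-keys` entry in order, treated as a prefix, with the most recently created match preferred. The Python sketch below is only a model of that order, not the action's implementation. The function name `resolve_cache`, the shortened hash placeholders (`lockA`, `srcOld`, and so on), and the timestamps are all assumptions made for illustration.

```python
from typing import Optional


def resolve_cache(key: str, restore_keys: list[str], stored: dict[str, int]) -> Optional[str]:
    """Return the cache entry a restore would pick, or None on a miss.

    `stored` maps existing cache keys to a creation timestamp (hypothetical data).
    """
    # 1. An exact match on the primary key always wins.
    if key in stored:
        return key
    # 2. Otherwise try each restore key in order, as a prefix match,
    #    preferring the most recently created entry for that prefix.
    for prefix in restore_keys:
        matches = [k for k in stored if k.startswith(prefix)]
        if matches:
            return max(matches, key=lambda k: stored[k])
    return None


# Placeholder key shape modelled on the "Rust metadata" step; hashes are shortened.
base = "inql-rust-inspect-Linux-incan-0.3.0-rc47-"
stored = {
    base + "lockA-srcOld": 100,    # same lock files, older sources
    base + "lockB-srcOther": 200,  # different lock files, newer entry
}
restore_keys = [base + "lockA-", base]

hit = resolve_cache(base + "lockA-srcNew", restore_keys, stored)
print(hit)
```

In this model the lookup falls back to the `lockA-srcOld` entry even though the `lockB` entry is newer, because the more specific restore key is listed first. That ordering is why the metadata step lists the lock-file-scoped prefix before the version-only prefix.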
2 changes: 1 addition & 1 deletion README.md
@@ -13,7 +13,7 @@
 - **Carriers that know their row type** — `DataFrame[T]`, `LazyFrame[T]`, and `DataStream[T]` share a `DataSet[T]` surface; bounded vs unbounded is reflected in the type hierarchy so unsafe streaming operations can be rejected at compile time.
 - **SQL-familiar `query { }` blocks** — Clause-oriented relational syntax, typed against the current query schema, aligned with the same resolution rules as method chains.
 - **One naming model** — `.column`, `alias.column`, bare names in the query schema, and ordinary Incan bindings are specified so blocks, chains, and future surfaces stay equivalent where it counts.
-- **A registry-backed function catalog** — Core operators, aggregates, common scalar functions, window helpers, generators, nested-data helpers, and compatibility aliases share one checked helper model and carry portable metadata for Substrait and backend adapters.
+- **A registry-backed function catalog** — Core operators, aggregates, common scalar functions, window helpers, generators, nested-data helpers, and compatibility aliases share one checked helper model and carry portable metadata for Substrait and backend adapters. Typed helpers accept primitives where that is the natural authoring shape, such as `add(col("amount"), 1)`, `substring(col("sku"), 1, 3)`, or `cast(col("amount_text"), float)`, while query-schema validation checks referenced column types during planning and lowering.
 - **Portable logical plans** — Substrait is the normative interchange; read roots stay logical while binding and execution stay in the session layer (see RFCs 002 and 004).

 Design is **RFC-driven**; **[docs/rfcs/](docs/rfcs/README.md)** is the source of truth.
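
The new README sentence describes two separate things: helpers accept bare primitives at authoring time, and column types are checked later against the query schema. A minimal Python model of that split might look like the sketch below. It is not InQL's implementation: the node classes `Col`, `Lit`, and `Add`, the helper `as_expr`, the checker `check_numeric`, and the string type tags in the schema are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Col:
    name: str


@dataclass(frozen=True)
class Lit:
    value: Union[int, float, str, bool]


@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"


Expr = Union[Col, Lit, Add]


def as_expr(value) -> Expr:
    """Pass expression nodes through; wrap bare primitives in a literal node."""
    return value if isinstance(value, (Col, Lit, Add)) else Lit(value)


def add(left, right) -> Add:
    # Authoring time: no schema is needed to build the expression tree.
    return Add(as_expr(left), as_expr(right))


def check_numeric(expr: Expr, schema: dict[str, str]) -> None:
    """Planning-time check: raise if `expr` is not numeric under `schema`."""
    if isinstance(expr, Col):
        if schema.get(expr.name) not in ("int", "float"):
            raise TypeError(f"column {expr.name!r} is missing or not numeric")
    elif isinstance(expr, Lit):
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(expr.value, bool) or not isinstance(expr.value, (int, float)):
            raise TypeError(f"literal {expr.value!r} is not numeric")
    else:
        check_numeric(expr.left, schema)
        check_numeric(expr.right, schema)


expr = add(Col("amount"), 1)              # the primitive 1 becomes Lit(1)
check_numeric(expr, {"amount": "float"})  # passes silently for a numeric column
```

Under this model, `add(Col("amount"), 1)` and `add(Col("amount"), Lit(1))` build the same tree, and the same tree is rejected only when the schema says `amount` is a string.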
12 changes: 6 additions & 6 deletions docs/language/explanation/dataset_carriers.md
@@ -60,7 +60,7 @@ from pub::inql.functions import col, gt, lit
 from models import Order

 def high_value_orders(orders: LazyFrame[Order]) -> LazyFrame[Order]:
-    return orders.filter(gt(col("amount"), lit(100)))
+    return orders.filter(gt(col("amount"), 100))
 ```

 ### `DataStream[T]` — streaming
@@ -69,11 +69,11 @@ Use `DataStream[T]` for streaming/unbounded data:

 ```incan
 from pub::inql import DataStream
-from pub::inql.functions import col, eq, lit
+from pub::inql.functions import col, eq
 from models import Event

 def important_events(events: DataStream[Event]) -> DataStream[Event]:
-    return events.filter(eq(col("severity"), lit("critical")))
+    return events.filter(eq(col("severity"), "critical"))
 ```

 `DataStream[T]` shares the same operation API as batch carriers, but signals that its source is unbounded. Static streaming constraints are specified in RFC 001 and enforced as the compiler gains analysis for `UnboundedDataSet[T]`.
@@ -155,8 +155,8 @@ from models import Order
 def enrich_orders(orders: LazyFrame[Order]) -> LazyFrame[Order]:
     return (
         orders
-        .with_column("amount_x2", mul(col("amount"), lit(2)))
-        .with_column("amount_plus_one", add(col("amount"), lit(1)))
+        .with_column("amount_x2", mul(col("amount"), 2))
+        .with_column("amount_plus_one", add(col("amount"), 1))
     )
 ```

@@ -184,7 +184,7 @@ from models import Order
 def summarize_orders(orders: LazyFrame[Order]) -> LazyFrame[Order]:
     grouped = (
         orders
-        .with_column("amount_plus_one", add(col("amount"), lit(1)))
+        .with_column("amount_plus_one", add(col("amount"), 1))
         .group_by([col("customer_id")])
         .agg([sum(col("amount")), count()])
     )
4 changes: 2 additions & 2 deletions docs/language/explanation/execution_context.md
@@ -80,8 +80,8 @@ from models import Order
 session = Session.default()

 orders: LazyFrame[Order] = session.read_csv("orders", "orders.csv")?
-enriched = orders.with_column("amount_x2", mul(col("amount"), lit(2)))
-filtered = enriched.filter(gt(col("amount"), lit(100))).limit(10)
+enriched = orders.with_column("amount_x2", mul(col("amount"), 2))
+filtered = enriched.filter(gt(col("amount"), 100)).limit(10)

 session.activate()
 preview = filtered.collect()?
2 changes: 1 addition & 1 deletion docs/language/reference/builders/aggregates.md
@@ -35,7 +35,7 @@ Aggregate measures support method-style modifiers:
 from pub::inql.functions import add, approx_count_distinct, approx_percentile, avg, col, count, count_distinct, count_if, eq, lit, max, min, str_lit, sum

 grouped = orders.group_by([col("customer_id")]).agg([
-    sum(add(col("amount"), lit(5))),
+    sum(add(col("amount"), 5)),
     count(),
     count(col("discount_code")),
     count_distinct(col("product_id")),
28 changes: 14 additions & 14 deletions docs/language/reference/builders/filters.md
@@ -4,31 +4,31 @@ Current filter authoring uses the shared scalar-expression builder model.

 ## Functions

-| Builder | Signature | Meaning |
-| -------------- | ----------------------------------------------------------- | ---------------------------------------------------------------------- |
-| `always_true` | `def always_true() -> ColumnExpr` | Trivial boolean scalar expression; canonical rewrite can eliminate it. |
-| `always_false` | `def always_false() -> ColumnExpr` | Boolean scalar expression that rejects every row. |
-| `eq` | `def eq(left: ColumnExpr, right: ColumnExpr) -> ColumnExpr` | Equality predicate scalar expression. |
-| `gt` | `def gt(left: ColumnExpr, right: ColumnExpr) -> ColumnExpr` | Greater-than predicate scalar expression. |
-| `lit` | `def lit(value: int \| float \| str \| bool) -> ColumnExpr` | Canonical scalar literal helper. |
-| `int_lit` | `def int_lit(value: int) -> ColumnExpr` | Typed integer literal helper. |
-| `str_lit` | `def str_lit(value: str) -> ColumnExpr` | Typed string literal helper. |
-| `bool_lit` | `def bool_lit(value: bool) -> ColumnExpr` | Typed boolean literal helper. |
+| Builder | Signature | Meaning |
+| -------------- | ------------------------------------------------------------------------- | ---------------------------------------------------------------------- |
+| `always_true` | `def always_true() -> BoolLiteralExpr` | Trivial boolean scalar expression; canonical rewrite can eliminate it. |
+| `always_false` | `def always_false() -> BoolLiteralExpr` | Boolean scalar expression that rejects every row. |
+| `eq` | `def eq(left: ScalarValueOrColumn, right: ScalarValueOrColumn) -> BoolColumnExpr` | Equality predicate scalar expression. |
+| `gt` | `def gt(left: ScalarValueOrColumn, right: ScalarValueOrColumn) -> BoolColumnExpr` | Greater-than predicate scalar expression. |
+| `lit` | `def lit(value: int \| float \| str \| bool) -> ColumnExpr` | Canonical scalar literal helper. |
+| `int_lit` | `def int_lit(value: int) -> IntLiteralExpr` | Typed integer literal helper. |
+| `str_lit` | `def str_lit(value: str) -> StringLiteralExpr` | Typed string literal helper. |
+| `bool_lit` | `def bool_lit(value: bool) -> BoolLiteralExpr` | Typed boolean literal helper. |

 ## Example

 ```incan
-from pub::inql.functions import col, eq, gt, lit
+from pub::inql.functions import col, eq, gt

 filtered = (
     orders
-    .filter(gt(col("amount"), lit(100)))
-    .filter(eq(col("status"), lit("open")))
+    .filter(gt(col("amount"), 100))
+    .filter(eq(col("status"), "open"))
 )
 ```

 ## Notes

 - Filter predicates are scalar expressions, not a separate predicate-only builder hierarchy.
-- The typed `*_lit(...)` helpers construct the same scalar-literal representation as `lit(...)`.
+- Primitive values are accepted where predicate helper signatures use value-or-column aliases. Use `lit(...)` or typed literal helpers when a broad `ColumnExpr` is required explicitly.
 - Boolean composition belongs to the broader scalar-function surface.
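
The notes above say that filter predicates are ordinary scalar expressions and that predicate helpers accept either a primitive or a column. One way to picture that is the toy evaluator below, which treats every expression as a function from a row to a value. It is a sketch under stated assumptions, not InQL's representation: the `Expr` wrapper, the `_lift` and `_binary` helpers, `filter_rows`, and the sample rows are hypothetical, and only the names `col`, `lit`, `eq`, and `gt` are borrowed from the documented helpers.

```python
import operator
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Expr:
    """A scalar expression modelled as a function from a row to a value."""
    evaluate: Callable[[dict], Any]


def col(name: str) -> Expr:
    return Expr(lambda row: row[name])


def lit(value: Any) -> Expr:
    return Expr(lambda row: value)


def _lift(value: Any) -> Expr:
    # Value-or-column: expressions pass through, primitives become literals.
    return value if isinstance(value, Expr) else lit(value)


def _binary(op: Callable[[Any, Any], Any]) -> Callable[[Any, Any], Expr]:
    def build(left: Any, right: Any) -> Expr:
        l, r = _lift(left), _lift(right)
        return Expr(lambda row: op(l.evaluate(row), r.evaluate(row)))
    return build


# Predicates are built by the same machinery as any other scalar expression.
eq = _binary(operator.eq)
gt = _binary(operator.gt)


def filter_rows(rows: list[dict], predicate: Expr) -> list[dict]:
    return [row for row in rows if predicate.evaluate(row)]


orders = [
    {"id": 1, "amount": 250, "status": "open"},
    {"id": 2, "amount": 50, "status": "open"},
    {"id": 3, "amount": 400, "status": "closed"},
]
kept = filter_rows(filter_rows(orders, gt(col("amount"), 100)), eq(col("status"), "open"))
print([row["id"] for row in kept])
```

In this toy model, `gt(col("amount"), 100)` and `gt(col("amount"), lit(100))` select the same rows, which mirrors the before and after forms in the example diff.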
32 changes: 16 additions & 16 deletions docs/language/reference/builders/projections.md
@@ -4,18 +4,18 @@ Projection builders are the current semantic target for scalar expressions in co

 ## Functions

-| Builder | Signature | Meaning |
-| ------------ | ------------------------------------------------------------ | --------------------------- |
-| `col` | `def col(name: str) -> ColumnExpr` | Named column reference. |
-| `lit` | `def lit(value: int \| float \| str \| bool) -> ColumnExpr` | Canonical scalar literal. |
-| `int_expr` | `def int_expr(value: int) -> ColumnExpr` | Integer literal expression. |
-| `float_expr` | `def float_expr(value: float) -> ColumnExpr` | Float literal expression. |
-| `str_expr` | `def str_expr(value: str) -> ColumnExpr` | String literal expression. |
-| `bool_expr` | `def bool_expr(value: bool) -> ColumnExpr` | Boolean literal expression. |
-| `add` | `def add(left: ColumnExpr, right: ColumnExpr) -> ColumnExpr` | Binary addition. |
-| `mul` | `def mul(left: ColumnExpr, right: ColumnExpr) -> ColumnExpr` | Binary multiplication. |
-| `eq` | `def eq(left: ColumnExpr, right: ColumnExpr) -> ColumnExpr` | Equality predicate. |
-| `gt` | `def gt(left: ColumnExpr, right: ColumnExpr) -> ColumnExpr` | Greater-than predicate. |
+| Builder | Signature | Meaning |
+| ------------ | ------------------------------------------------------------------------ | --------------------------- |
+| `col` | `def col(name: str) -> ColumnRefExpr` | Named column reference. |
+| `lit` | `def lit(value: int \| float \| str \| bool) -> ColumnExpr` | Canonical scalar literal. |
+| `int_expr` | `def int_expr(value: int) -> IntLiteralExpr` | Integer literal expression. |
+| `float_expr` | `def float_expr(value: float) -> FloatLiteralExpr` | Float literal expression. |
+| `str_expr` | `def str_expr(value: str) -> StringLiteralExpr` | String literal expression. |
+| `bool_expr` | `def bool_expr(value: bool) -> BoolLiteralExpr` | Boolean literal expression. |
+| `add` | `def add(left: NumberValueOrColumn, right: NumberValueOrColumn) -> NumberColumnExpr` | Binary addition. |
+| `mul` | `def mul(left: NumberValueOrColumn, right: NumberValueOrColumn) -> NumberColumnExpr` | Binary multiplication. |
+| `eq` | `def eq(left: ScalarValueOrColumn, right: ScalarValueOrColumn) -> BoolColumnExpr` | Equality predicate. |
+| `gt` | `def gt(left: ScalarValueOrColumn, right: ScalarValueOrColumn) -> BoolColumnExpr` | Greater-than predicate. |

 ## Dataset entrypoint

@@ -29,17 +29,17 @@ def with_column(self, name: str, expr: ColumnExpr) -> Self
 ## Example

 ```incan
-from pub::inql.functions import add, col, lit, mul
+from pub::inql.functions import add, col, mul

 projected = (
     orders
-    .with_column("amount_x2", mul(col("amount"), lit(2)))
-    .with_column("amount_plus_one", add(col("amount"), lit(1)))
+    .with_column("amount_x2", mul(col("amount"), 2))
+    .with_column("amount_plus_one", add(col("amount"), 1))
 )
 ```

 ## Capability notes

 - `with_column(...)` is the explicit computed-column entrypoint.
 - Projection-list selection, query-block projection sugar, and alias-free symbolic surfaces lower to this scalar-expression model when exposed.
-- The typed literal helpers construct the same scalar-literal representation as `lit(...)`.
+- Numeric, string, and boolean helpers accept primitive values where their public signatures use value-or-column aliases. Use `lit(...)` for broad scalar-expression positions that specifically require a `ColumnExpr`.
6 changes: 3 additions & 3 deletions docs/language/reference/dataset_methods.md
@@ -53,14 +53,14 @@ def with_column(self, name: str, expr: ColumnExpr) -> Self

 ```incan
 from pub::inql import LazyFrame
-from pub::inql.functions import add, col, lit, mul
+from pub::inql.functions import add, col, mul
 from models import Order

 def enrich(orders: LazyFrame[Order]) -> LazyFrame[Order]:
     return (
         orders
-        .with_column("amount_x2", mul(col("amount"), lit(2)))
-        .with_column("amount_plus_one", add(col("amount"), lit(1)))
+        .with_column("amount_x2", mul(col("amount"), 2))
+        .with_column("amount_plus_one", add(col("amount"), 1))
     )
 ```
