1 change: 1 addition & 0 deletions docs/features/index.md
@@ -355,6 +355,7 @@
- [PPL Rename Command](sql/ppl-rename-command.md)
- [PPL Rex and Regex Commands](sql/ppl-rex-and-regex-commands.md)
- [PPL Spath Command](sql/ppl-spath-command.md)
- [PPL Timechart Command](sql/ppl-timechart-command.md)
- [Security Lake Data Source](sql/security-lake-data-source.md)
- [SQL Error Handling](sql/sql-error-handling.md)
- [SQL Pagination](sql/sql-pagination.md)
174 changes: 174 additions & 0 deletions docs/features/sql/ppl-timechart-command.md
@@ -0,0 +1,174 @@
# PPL Timechart Command

## Summary

The PPL `timechart` command aggregates data into time buckets to produce time-series results suitable for visualization. It supports various span intervals, rate-based aggregation functions (`per_second`, `per_minute`, `per_hour`, `per_day`), custom timestamp fields, and grouping by categorical fields. It is useful for performance monitoring, log analysis, and trend visualization in OpenSearch.

## Details

### Architecture

```mermaid
graph TB
subgraph "PPL Query Processing"
A[PPL Query with timechart] --> B[Parser]
B --> C[Timechart AST Node]
C --> D[Query Rewriter]
D --> E{per_* function?}
E -->|Yes| F[Transform to sum + eval]
E -->|No| G[Standard aggregation]
F --> H[Chart Command Handler]
G --> H
H --> I[OpenSearch Aggregation Query]
end

subgraph "Span Processing"
J[Span Definition] --> K{Span Type}
K -->|Fixed| L[Static calculation]
K -->|Variable| M[Dynamic timestampdiff]
L --> N[Time Bucket Creation]
M --> N
end
```

### Data Flow

```mermaid
flowchart LR
A[Source Data] --> B[Time Bucketing]
B --> C[Aggregation]
C --> D{Has per_* function?}
D -->|Yes| E[Rate Normalization]
D -->|No| F[Direct Output]
E --> G[Result Set]
F --> G
```
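
The bucketing and aggregation stages of this flow can be illustrated outside OpenSearch. The following Python sketch is not the plugin's implementation; the `(timestamp, group)` event layout and the `bucket_counts` helper are hypothetical. It only mimics what a query such as `timechart span=1h count() by log_level` computes.

```python
from collections import Counter
from datetime import datetime, timezone


def bucket_counts(events, span_seconds):
    """Count events per (bucket start, group) pair.

    `events` is an iterable of (timestamp, group) tuples; this layout is
    an assumption made for the illustration.
    """
    counts = Counter()
    for ts, group in events:
        epoch = int(ts.timestamp())
        # Floor the timestamp to the start of its bucket.
        start = epoch - epoch % span_seconds
        counts[(datetime.fromtimestamp(start, tz=timezone.utc), group)] += 1
    return dict(counts)


events = [
    (datetime(2025, 9, 8, 10, 5, tzinfo=timezone.utc), "ERROR"),
    (datetime(2025, 9, 8, 10, 40, tzinfo=timezone.utc), "ERROR"),
    (datetime(2025, 9, 8, 11, 15, tzinfo=timezone.utc), "INFO"),
]
result = bucket_counts(events, span_seconds=3600)
```

Here the two `ERROR` events land in the 10:00 bucket and the `INFO` event in the 11:00 bucket, so `result` holds one row per bucket and group.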

### Components

| Component | Description |
|-----------|-------------|
| `Timechart` AST | Represents the timechart command in the abstract syntax tree |
| `Chart` Command Handler | Unified handler for both `chart` and `timechart` commands |
| `SpanUnit` | Enum for time span units (millisecond, second, minute, hour, day, week, month, quarter, year) |
| `IntervalUnit` | Enum for interval calculations including millisecond support |
| `PlanUtils` | Utility for span unit to interval unit conversion |

### Configuration

| Setting | Description | Default |
|---------|-------------|---------|
| `timefield` | Specifies the timestamp field to use for time bucketing | `@timestamp` |
| `span` | Time interval for bucketing (e.g., `1m`, `5m`, `1h`, `500ms`) | Required |

### Syntax

```
timechart [timefield=<field>] span=<interval> <aggregation>... [by <field>]
```

#### Parameters

| Parameter | Required | Description |
|-----------|----------|-------------|
| `timefield` | No | Custom timestamp field name (default: `@timestamp`) |
| `span` | Yes | Time bucket interval |
| `aggregation` | Yes | One or more aggregation functions |
| `by` | No | Field to split results by |

#### Supported Span Units

| Unit | Abbreviation | Example |
|------|--------------|---------|
| Millisecond | `ms` | `span=500ms` |
| Second | `s` | `span=30s` |
| Minute | `m` | `span=5m` |
| Hour | `h` | `span=1h` |
| Day | `d` | `span=1d` |
| Week | `w` | `span=1w` |
| Month | `mon` | `span=1mon` |
| Quarter | `q` | `span=1q` |
| Year | `y` | `span=1y` |
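
As a rough illustration of how the fixed-length units in this table relate to one another, the Python sketch below converts a span string to milliseconds. The `span_to_millis` helper is hypothetical and is not part of the plugin; it deliberately rejects `mon`, `q`, and `y` because those units have no fixed length.

```python
import re

# Fixed-length units only, expressed in milliseconds.
FIXED_UNIT_MILLIS = {
    "ms": 1,
    "s": 1_000,
    "m": 60_000,
    "h": 3_600_000,
    "d": 86_400_000,
    "w": 604_800_000,
}


def span_to_millis(span):
    """Convert a span such as '500ms' or '5m' to milliseconds."""
    match = re.fullmatch(r"(\d+)(ms|s|m|h|d|w)", span)
    if match is None:
        raise ValueError(f"unsupported or variable-length span: {span!r}")
    value, unit = match.groups()
    return int(value) * FIXED_UNIT_MILLIS[unit]


half_second = span_to_millis("500ms")
five_minutes = span_to_millis("5m")
```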

### Rate-Based Aggregation Functions

| Function | Description | Calculation |
|----------|-------------|-------------|
| `per_second(field)` | Per-second rate | `sum(field) / span_in_seconds` |
| `per_minute(field)` | Per-minute rate | `sum(field) * 60 / span_in_seconds` |
| `per_hour(field)` | Per-hour rate | `sum(field) * 3600 / span_in_seconds` |
| `per_day(field)` | Per-day rate | `sum(field) * 86400 / span_in_seconds` |
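
The arithmetic in this table can be checked with a few lines of Python. This is a minimal sketch of the formulas only, not the plugin's code; the `rate` helper and the sample numbers are made up for the example.

```python
# Seconds in each target unit, matching the table above.
UNIT_SECONDS = {
    "per_second": 1,
    "per_minute": 60,
    "per_hour": 3600,
    "per_day": 86400,
}


def rate(function_name, bucket_sum, span_seconds):
    """Normalize a bucket's sum to the unit named by `function_name`."""
    return bucket_sum * UNIT_SECONDS[function_name] / span_seconds


# A 5-minute bucket (300 seconds) whose values sum to 600.
per_second = rate("per_second", 600, 300)  # 2.0
per_minute = rate("per_minute", 600, 300)  # 120.0
per_hour = rate("per_hour", 600, 300)      # 7200.0
```

Every function starts from the same bucket sum; only the scaling factor differs.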

### Usage Examples

#### Basic time-series aggregation

```
source=web_logs
| timechart span=1h count() as requests
```

#### Rate calculation with per_second

```
source=network_logs
| timechart span=5m per_second(bytes) as bytes_per_second
```

#### Multiple aggregations

```
source=metrics
| timechart span=1m avg(cpu_usage), max(memory_usage), per_second(requests)
```

#### Custom timestamp field

```
source=custom_events
| timechart timefield=event_timestamp span=1h count() by event_type
```

#### Millisecond precision for high-frequency data

```
source=trading_data
| timechart span=100ms per_second(transactions)
```

#### Grouping by category

```
source=application_logs
| timechart span=15m count() by log_level
```

## Limitations

- `per_*` functions work exclusively with the `timechart` command
- Variable-length spans (month/quarter/year) compute the bucket length dynamically, which may add slight performance overhead
- The `timechart` command requires a timestamp field in the source data

## Related PRs

| Version | PR | Description |
|---------|-----|-------------|
| v3.4.0 | [#4464](https://github.com/opensearch-project/sql/pull/4464) | Add `per_second` function support |
| v3.4.0 | [#4531](https://github.com/opensearch-project/sql/pull/4531) | Add `per_minute`, `per_hour`, `per_day` functions |
| v3.4.0 | [#4672](https://github.com/opensearch-project/sql/pull/4672) | Support millisecond span |
| v3.4.0 | [#4755](https://github.com/opensearch-project/sql/pull/4755) | Merge `timechart` and `chart` implementations |
| v3.4.0 | [#4784](https://github.com/opensearch-project/sql/pull/4784) | Add `timefield` option |

## References

- [Issue #4350](https://github.com/opensearch-project/sql/issues/4350): PPL `per_*` aggregation function support
- [Issue #4550](https://github.com/opensearch-project/sql/issues/4550): Millisecond span bug fix
- [Issue #4576](https://github.com/opensearch-project/sql/issues/4576): Custom timestamp field feature request
- [Issue #4581](https://github.com/opensearch-project/sql/issues/4581): Timechart bug fixes
- [Issue #4582](https://github.com/opensearch-project/sql/issues/4582): Timechart bug fixes
- [Issue #4632](https://github.com/opensearch-project/sql/issues/4632): Timechart bug fixes
- [PPL Commands Documentation](https://docs.opensearch.org/3.0/search-plugins/sql/ppl/functions/)

## Change History

- **v3.4.0** (2026-01): Added `per_second`, `per_minute`, `per_hour`, `per_day` functions; millisecond span support; `timefield` option; merged `timechart` and `chart` implementations
162 changes: 162 additions & 0 deletions docs/releases/v3.4.0/features/sql/ppl-timechart-functions.md
@@ -0,0 +1,162 @@
# PPL Timechart Functions

## Summary

OpenSearch v3.4.0 enhances the PPL `timechart` command with new rate-based aggregation functions (`per_second`, `per_minute`, `per_hour`, `per_day`), millisecond span support, and custom timestamp field selection via `timefield`. It also merges the `timechart` and `chart` implementations internally.

## Details

### What's New in v3.4.0

#### Rate-Based Aggregation Functions

The `per_*` functions calculate rate-based metrics by normalizing aggregated values to specific time units:

| Function | Description | Calculation |
|----------|-------------|-------------|
| `per_second(field)` | Per-second rate | `sum(field) / span_seconds` |
| `per_minute(field)` | Per-minute rate | `sum(field) * 60 / span_seconds` |
| `per_hour(field)` | Per-hour rate | `sum(field) * 3600 / span_seconds` |
| `per_day(field)` | Per-day rate | `sum(field) * 86400 / span_seconds` |

#### Implementation Approach

The `per_*` functions use an **Eval Transformation** approach that rewrites the query at compile time:

```
-- Original query
source=events | timechart span=5m per_second(packets)

-- Rewritten internally to
source=events | timechart span=5m sum(packets) as `per_second(packets)`
| eval `per_second(packets)` = `per_second(packets)` / 300
```
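
To make the shape of this rewrite concrete, the Python sketch below builds the rewritten query text for a fixed-length span. It works on strings purely for illustration and should not be read as how the plugin performs the rewrite; `rewrite_per_second` is a hypothetical helper.

```python
def rewrite_per_second(source, span, span_seconds, field):
    """Build the rewritten query text for a fixed-length span."""
    # The sum is aliased to the original function call so the output
    # column keeps the name the user asked for.
    alias = f"`per_second({field})`"
    return (
        f"source={source} | timechart span={span} sum({field}) as {alias}\n"
        f"| eval {alias} = {alias} / {span_seconds}"
    )


rewritten = rewrite_per_second("events", "5m", 300, "packets")
```

The divisor `300` is the span length in seconds (5 minutes), which is known up front for fixed-length units.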

For variable-length spans (month/quarter/year), a bucket does not contain a fixed number of seconds, so the implementation calculates each bucket's length dynamically:

```
-- For span=2mon, uses timestampdiff for accurate calculation
| eval `per_second(packets)` = `per_second(packets)` /
timestampdiff(SECOND, @timestamp, timestampadd(MONTH, 2, @timestamp))
```
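
The reason a constant divisor is not enough can be seen by computing the length of two different two-month buckets. The Python sketch below mirrors the `timestampdiff`/`timestampadd` idea with the standard library; the helpers are hypothetical, and the code assumes each bucket starts on the first day of a month.

```python
from datetime import datetime


def add_months(ts, months):
    """Shift a first-of-month timestamp forward by whole months."""
    index = ts.year * 12 + (ts.month - 1) + months
    return ts.replace(year=index // 12, month=index % 12 + 1)


def bucket_seconds(bucket_start, months):
    """Length in seconds of a bucket spanning `months` calendar months."""
    return int((add_months(bucket_start, months) - bucket_start).total_seconds())


jan = bucket_seconds(datetime(2025, 1, 1), 2)  # January + February: 59 days
jul = bucket_seconds(datetime(2025, 7, 1), 2)  # July + August: 62 days
```

The two buckets differ by three days (5,097,600 versus 5,356,800 seconds), so the divisor has to be derived from each bucket's own start time.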

#### Custom Timestamp Field (`timefield`)

Users can now specify a custom timestamp field instead of the implicit `@timestamp`:

```
source=events | timechart timefield=start_at span=1hour count() by category
```

This resolves the limitation where indexes with non-standard timestamp field names required workarounds like renaming fields.

#### Millisecond Span Support

This release fixes a bug in which millisecond spans were incorrectly converted to microseconds. The `IntervalUnit` enum now includes `MILLISECOND`, enabling accurate time bucketing at sub-second precision:

```
source=logs | timechart span=500ms count()
```

#### Architecture Improvements

The `timechart` and `chart` command implementations have been merged. Since `timechart` is semantically a subset of `chart` (with row-split fixed to the timestamp field), this consolidation:

- Reduces code duplication
- Fixes several existing bugs in `timechart`
- Simplifies maintenance

### Technical Changes

```mermaid
graph TB
subgraph "Query Processing"
A[PPL Query] --> B[Parser]
B --> C[Timechart AST]
C --> D{Has per_* function?}
D -->|Yes| E[Rewrite to sum + eval]
D -->|No| F[Standard aggregation]
E --> G[Execute Query]
F --> G
end

subgraph "Span Calculation"
H[Fixed Span] --> I[Static seconds calculation]
J[Variable Span] --> K[timestampdiff dynamic calculation]
end
```

#### New Configuration

| Setting | Description | Default |
|---------|-------------|---------|
| `timefield` | Custom timestamp field name | `@timestamp` |

### Usage Examples

#### Basic per_second calculation

```
source=network_logs
| timechart span=1m per_second(packets)
```

Result:
```
| @timestamp | per_second(packets) |
|---------------------|---------------------|
| 2025-09-08 10:00:00 | 2.0 |
```

#### Multiple rate functions

```
source=network_logs
| timechart span=1m per_second(packets), per_minute(packets), per_hour(packets)
```

#### Custom timestamp field

```
source=ocsf_events
| timechart timefield=event_time span=1h count() by category
```

#### Millisecond precision

```
source=high_frequency_logs
| timechart span=100ms per_second(requests)
```

### Migration Notes

- No breaking changes; existing `timechart` queries continue to work
- Users with custom timestamp fields can now use `timefield` instead of a `rename` workaround
- Millisecond spans now work correctly without manual adjustments

## Limitations

- `per_*` functions work exclusively with the `timechart` command (due to implicit timestamp field dependency)
- Variable-length spans (month/quarter/year) require dynamic calculation of the bucket length, which may add slight performance overhead

## Related PRs

| PR | Description |
|----|-------------|
| [#4464](https://github.com/opensearch-project/sql/pull/4464) | Add `per_second` function support for `timechart` command |
| [#4531](https://github.com/opensearch-project/sql/pull/4531) | Add `per_minute`, `per_hour`, `per_day` function support |
| [#4672](https://github.com/opensearch-project/sql/pull/4672) | Support millisecond span |
| [#4755](https://github.com/opensearch-project/sql/pull/4755) | Merge the implementation of `timechart` and `chart` |
| [#4784](https://github.com/opensearch-project/sql/pull/4784) | Specify timestamp field with `timefield` in timechart command |

## References

- [Issue #4350](https://github.com/opensearch-project/sql/issues/4350): PPL `per_*` aggregation function support
- [Issue #4550](https://github.com/opensearch-project/sql/issues/4550): Span millisecond incorrectly converted to microsecond
- [Issue #4576](https://github.com/opensearch-project/sql/issues/4576): timechart with option to specify timestamp column
- [PPL Commands Documentation](https://docs.opensearch.org/3.0/search-plugins/sql/ppl/functions/)

## Related Feature Report

- [Full feature documentation](../../../features/sql/ppl-timechart-command.md)
1 change: 1 addition & 0 deletions docs/releases/v3.4.0/index.md
@@ -134,6 +134,7 @@

### SQL

- [PPL Timechart Functions](features/sql/ppl-timechart-functions.md) - Rate-based aggregation functions (per_second, per_minute, per_hour, per_day), millisecond span support, custom timefield option, merged timechart/chart implementation
- [PPL Query Optimization](features/sql/ppl-query-optimization.md) - 33 enhancements including sort pushdown, aggregation optimization, distinct count approx, case-to-range queries, fillnull command, YAML explain format
- [SQL/PPL Bugfixes](features/sql/sql-ppl-bugfixes.md) - 48 bug fixes including memory exhaustion fix, race condition fix, rex nested capture groups, filter pushdown improvements, and CVE-2025-48924
- [SQL CI/Tests](features/sql/sql-ci-tests.md) - CI/CD improvements including Gradle 9.2.0, JDK 25, BWC test splitting, query timeouts, and maven snapshots publishing