[META]PPL new trendline command #3011

Closed
@YANG-DB

Description

Is your feature request related to a problem?

Add a new PPL trendline command that computes moving averages of fields.

We would like to support two flavours of moving average:

SMA : Simple moving average

  • f[i]: The value of field 'f' in the i-th data-point
  • n: The number of data-points in the moving window (period)
  • t: The current time index

SMA(t) = (1/n) * Σ(f[i]), where i = t-n+1 to t
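
To make the SMA formula concrete, here is a minimal Python sketch. The function name `sma` and the choice to emit `None` for the first n-1 data-points (where a full window is not yet available) are assumptions made for illustration; the issue does not specify how incomplete windows should be handled.

```python
from typing import List, Optional


def sma(values: List[float], n: int) -> List[Optional[float]]:
    """Simple moving average: SMA(t) = (1/n) * sum(f[i]) for i = t-n+1 .. t."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    out: List[Optional[float]] = []
    for t in range(len(values)):
        if t < n - 1:
            # Assumption: no value is produced until a full window exists.
            out.append(None)
        else:
            window = values[t - n + 1 : t + 1]  # the n most recent points
            out.append(sum(window) / n)
    return out


print(sma([1, 2, 3, 4, 5], 3))
```

With a window of 3, each output value is the mean of the current point and the two points before it.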


WMA : Weighted moving average

WMA(t) = Σ(w[i] * f[i]) / Σ(w[i]), where i = t-n+1 to t
Where w[i] is the weight for the i-th data-point.

In a typical WMA, the weights are linearly decreasing from the most recent to the oldest data-point:
w[i] = n - (t - i), where i = t-n+1 to t

The complete formulation would be:
WMA(t) = Σ((n - (t - i)) * f[i]) / Σ(n - (t - i)), where i = t-n+1 to t
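
The WMA formulation above can be sketched in Python as follows. The name `wma` and the `None` output for incomplete windows are illustrative assumptions, not part of the proposal. The denominator uses the fact that the weights n - (t - i) run over 1..n, so their sum is n(n+1)/2.

```python
from typing import List, Optional


def wma(values: List[float], n: int) -> List[Optional[float]]:
    """Weighted moving average with linear weights w[i] = n - (t - i)."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    denom = n * (n + 1) / 2  # sum of the weights 1, 2, ..., n
    out: List[Optional[float]] = []
    for t in range(len(values)):
        if t < n - 1:
            # Assumption: no value is produced until a full window exists.
            out.append(None)
        else:
            # The newest point (i = t) gets weight n, the oldest gets weight 1.
            num = sum((n - (t - i)) * values[i] for i in range(t - n + 1, t + 1))
            out.append(num / denom)
    return out


print(wma([1, 2, 3, 4], 3))
```

For the window [1, 2, 3] with n = 3, the numerator is 1*1 + 2*2 + 3*3 = 14 and the denominator is 6, so the most recent value pulls the average above the plain mean of 2.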


Example

The following command computes a trendline (simple moving average) over a 5-month window of monthly event counts:

source=t | stats count(date_month) | trendline sma(5, count) AS trend | fields  trend

The next command would compute a 5-point simple moving average of the 'cpu_usage' field and store it in a new field called 'smooth_cpu'.

source=t | trendline sma(5, cpu_usage) as smooth_cpu

Multiple trendlines can be calculated in a single command, for example:

| trendline sma(10, memory) as mem_trend wma(5, network_traffic) as net_trend
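
As a rough illustration of what a multi-trendline command could produce, the sketch below applies several (type, period, field, alias) specifications to a list of rows. Everything here is hypothetical: the `trendline` function, the spec tuple layout, and the `None` value for rows without a full window are assumptions for illustration and do not describe either PPL engine's implementation.

```python
from typing import Any, Dict, List, Tuple


def _sma(window: List[float]) -> float:
    return sum(window) / len(window)


def _wma(window: List[float]) -> float:
    weights = range(1, len(window) + 1)  # oldest point gets 1, newest gets n
    return sum(w * v for w, v in zip(weights, window)) / sum(weights)


AVERAGES = {"sma": _sma, "wma": _wma}

# Hypothetical spec layout: (average type, period n, source field, alias)
Spec = Tuple[str, int, str, str]


def trendline(rows: List[Dict[str, Any]], specs: List[Spec]) -> List[Dict[str, Any]]:
    """Return copies of rows with one extra field per trendline spec."""
    out = [dict(row) for row in rows]  # leave the input rows untouched
    for kind, n, field, alias in specs:
        average = AVERAGES[kind]
        for t, row in enumerate(out):
            if t < n - 1:
                row[alias] = None  # assumption: incomplete window yields no value
            else:
                row[alias] = average([rows[i][field] for i in range(t - n + 1, t + 1)])
    return out


rows = [
    {"memory": 10, "network_traffic": 1},
    {"memory": 20, "network_traffic": 2},
    {"memory": 30, "network_traffic": 3},
]
specs = [("sma", 2, "memory", "mem_trend"), ("wma", 2, "network_traffic", "net_trend")]
for row in trendline(rows, specs):
    print(row)
```

Each spec adds its own output field, so the original fields and all computed trends sit side by side in every row.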

Support for PPL trendline functionality is required for both:

- OpenSearch based PPL engine

- Spark based PPL engine


Labels: PPL (Piped processing language), enhancement (New feature or request)

Status: Done