[BUG] distinct_count_approx does not work with array types #4533

@ahkcs


Query Information

PPL Query:

source = opensearch_dashboards_sample_data_ecommerce
| stats distinct_count_approx(`manufacturer`) as dc

Expected Result:
distinct_count_approx should return an approximate distinct count of the values in the manufacturer field, regardless of whether the field is a single value or an array.
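
As a reference point for the expected result, here is a minimal Python sketch that computes an exact distinct count over documents whose `manufacturer` value may be a single string or a list. It assumes each array element counts as its own value; that is an interpretation of the expectation above, not something the issue spells out. The helper name `distinct_count` and the sample documents are made up for illustration, and `distinct_count_approx` would be expected to return an approximation of this number.

```python
from typing import Iterable


def distinct_count(docs: Iterable[dict], field: str) -> int:
    """Exact distinct count of `field`, treating list values as multiple values."""
    seen = set()
    for doc in docs:
        value = doc.get(field)
        if value is None:
            continue  # a missing field contributes nothing
        # Normalize scalars to a one-element list so both shapes share one path.
        values = value if isinstance(value, list) else [value]
        seen.update(values)
    return len(seen)


# Hypothetical sample documents mixing scalar and array values.
docs = [
    {"manufacturer": "A"},
    {"manufacturer": ["A", "B"]},
    {"manufacturer": ["B", "C"]},
    {},
]
print(distinct_count(docs, "manufacturer"))  # 3 (A, B, C)
```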

Actual Result:
The query fails when the target field is of array type:

{
  "error": {
    "reason": "There was internal problem at backend",
    "details": "java.sql.SQLException: exception while executing query: class java.util.ArrayList cannot be cast to class java.lang.String (java.util.ArrayList and java.lang.String are in module java.base of loader 'bootstrap')",
    "type": "RuntimeException"
  },
  "status": 500
}

Steps to Reproduce:

  1. Create an index with an array-valued field (e.g., manufacturer as ["A", "B"]).

  2. Run:

    source = your_index | stats distinct_count_approx(`manufacturer`)
    
  3. Observe that the query fails with a 500 Internal Server Error.
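
The steps above could be scripted roughly as follows. This Python sketch builds the two HTTP requests involved: one that indexes a document with an array-valued `manufacturer`, and one that posts the PPL query to the `_plugins/_ppl` endpoint. The index name `your_index` is the placeholder from the steps; the base URL `http://localhost:9200`, the lack of authentication, and the helper names are assumptions to adjust for your cluster.

```python
import json
import urllib.error
import urllib.request

BASE_URL = "http://localhost:9200"  # assumption: local cluster without auth
INDEX = "your_index"                # placeholder name from the steps above
QUERY = f"source = {INDEX} | stats distinct_count_approx(`manufacturer`)"


def index_doc_request(doc: dict) -> urllib.request.Request:
    """Build a request that indexes one document and refreshes immediately."""
    return urllib.request.Request(
        f"{BASE_URL}/{INDEX}/_doc?refresh=true",
        data=json.dumps(doc).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ppl_request(query: str) -> urllib.request.Request:
    """Build a request that submits a PPL query to the SQL/PPL plugin."""
    return urllib.request.Request(
        f"{BASE_URL}/_plugins/_ppl",
        data=json.dumps({"query": query}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(req: urllib.request.Request):
    """Send a request and return (status, body); HTTP errors are returned, not raised."""
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.status, resp.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        return err.code, err.read().decode("utf-8")
```

To reproduce against a running cluster, call `send(index_doc_request({"manufacturer": ["A", "B"]}))` followed by `send(ppl_request(QUERY))`. On an affected version, the second call should come back with the 500 response shown under Actual Result.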

Metadata

Labels: PPL (Piped processing language), bug (Something isn't working)

Status: Done
