@raghuvanshraj
Contributor

@raghuvanshraj raghuvanshraj commented Nov 13, 2025

Description

This PR adds logging for the Rust code in the parquet-data-format module, routing it through the OpenSearch loggers via JNI.
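
To illustrate the idea, here is a minimal sketch of the Rust side of such a bridge. It is not the code in this PR: `Level`, `LogSink`, `MemorySink`, `tag`, and `log_to` are hypothetical names, and the JNI call is replaced by a trait so the sketch runs without a JVM. Whether the `[RUST]` prefix seen in the log lines in this description is added on the Rust or the Java side is also an assumption.

```rust
use std::sync::Mutex;

/// Severity levels the native side might emit. Hypothetical: the PR's own
/// level type is not shown in this conversation.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Level {
    Debug,
    Info,
    Warn,
    Error,
}

/// Where a record ends up. In the real module this role would be played by a
/// JNI call into the Java bridge; a trait keeps the sketch JVM-free.
pub trait LogSink {
    fn write(&self, level: Level, message: &str);
}

/// In-memory sink standing in for the Java logger (assumption for testing).
#[derive(Default)]
pub struct MemorySink {
    records: Mutex<Vec<(Level, String)>>,
}

impl MemorySink {
    /// Returns a copy of everything written so far.
    pub fn records(&self) -> Vec<(Level, String)> {
        self.records.lock().unwrap().clone()
    }
}

impl LogSink for MemorySink {
    fn write(&self, level: Level, message: &str) {
        self.records.lock().unwrap().push((level, message.to_string()));
    }
}

/// Tags a message so Rust-originated lines are easy to spot in the node log.
pub fn tag(message: &str) -> String {
    format!("[RUST] {message}")
}

/// Forwards one tagged record to the sink.
pub fn log_to(sink: &dyn LogSink, level: Level, message: &str) {
    sink.write(level, &tag(message));
}
```

Calling `log_to(&sink, Level::Info, "writer closed")` on a `MemorySink` stores one `Info` record whose text starts with `[RUST]`. Keeping the destination behind a trait also makes the native logging path testable without starting OpenSearch.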

Related Issues

Resolves #[Issue number to be closed when this PR is merged]

Check List

  • Functionality includes testing.
  • API changes companion pull request created, if applicable.
  • Public documentation issue/PR created, if applicable.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

[2025-11-13T09:58:40,161][INFO ][c.p.p.b.RustLoggerBridge ] [842f575163fb] [RUST] get_filtered_writer_memory_usage called with prefix: /Users/raghraaj/workplace/raghuvanshraj/OpenSearch-Fork/build/distribution/local/opensearch-3.3.0-SNAPSHOT/data/nodes/0/indices/5K29c_dUS5-_UvVNosyZew/0
[2025-11-13T09:58:40,162][INFO ][c.p.p.b.RustLoggerBridge ] [842f575163fb] [RUST] Total memory usage across 0 filtered ArrowWriters (prefix: /Users/raghraaj/workplace/raghuvanshraj/OpenSearch-Fork/build/distribution/local/opensearch-3.3.0-SNAPSHOT/data/nodes/0/indices/5K29c_dUS5-_UvVNosyZew/0): 0 bytes
[2025-11-13T09:58:40,162][INFO ][c.p.p.e.ParquetExecutionEngine] [842f575163fb] Native memory used by VSR Buffer Pool: 0
[2025-11-13T09:58:40,163][INFO ][c.p.p.e.ParquetExecutionEngine] [842f575163fb] Native memory used by ArrowWriters in shard path /Users/raghraaj/workplace/raghuvanshraj/OpenSearch-Fork/build/distribution/local/opensearch-3.3.0-SNAPSHOT/data/nodes/0/indices/5K29c_dUS5-_UvVNosyZew/0: 0
[2025-11-13T09:58:40,164][INFO ][o.o.i.IndexingMemoryController] [842f575163fb] total indexing heap bytes used [0b] vs indices.memory.index_buffer_size [102.3mb], total native bytes used [0b] vs native buffer [4.6gb], currently writing bytes [0b]
[2025-11-13T09:58:45,168][INFO ][c.p.p.b.RustLoggerBridge ] [842f575163fb] [RUST] get_filtered_writer_memory_usage called with prefix: /Users/raghraaj/workplace/raghuvanshraj/OpenSearch-Fork/build/distribution/local/opensearch-3.3.0-SNAPSHOT/data/nodes/0/indices/5K29c_dUS5-_UvVNosyZew/0
[2025-11-13T09:58:45,169][INFO ][c.p.p.b.RustLoggerBridge ] [842f575163fb] [RUST] Total memory usage across 0 filtered ArrowWriters (prefix: /Users/raghraaj/workplace/raghuvanshraj/OpenSearch-Fork/build/distribution/local/opensearch-3.3.0-SNAPSHOT/data/nodes/0/indices/5K29c_dUS5-_UvVNosyZew/0): 0 bytes
[2025-11-13T09:58:45,169][INFO ][c.p.p.e.ParquetExecutionEngine] [842f575163fb] Native memory used by VSR Buffer Pool: 0
[2025-11-13T09:58:45,170][INFO ][c.p.p.e.ParquetExecutionEngine] [842f575163fb] Native memory used by ArrowWriters in shard path /Users/raghraaj/workplace/raghuvanshraj/OpenSearch-Fork/build/distribution/local/opensearch-3.3.0-SNAPSHOT/data/nodes/0/indices/5K29c_dUS5-_UvVNosyZew/0: 0
[2025-11-13T09:58:45,170][INFO ][o.o.i.IndexingMemoryController] [842f575163fb] total indexing heap bytes used [0b] vs indices.memory.index_buffer_size [102.3mb], total native bytes used [0b] vs native buffer [4.6gb], currently writing bytes [0b]
[2025-11-13T09:58:50,175][INFO ][c.p.p.b.RustLoggerBridge ] [842f575163fb] [RUST] get_filtered_writer_memory_usage called with prefix: /Users/raghraaj/workplace/raghuvanshraj/OpenSearch-Fork/build/distribution/local/opensearch-3.3.0-SNAPSHOT/data/nodes/0/indices/5K29c_dUS5-_UvVNosyZew/0
[2025-11-13T09:58:50,176][INFO ][c.p.p.b.RustLoggerBridge ] [842f575163fb] [RUST] Total memory usage across 0 filtered ArrowWriters (prefix: /Users/raghraaj/workplace/raghuvanshraj/OpenSearch-Fork/build/distribution/local/opensearch-3.3.0-SNAPSHOT/data/nodes/0/indices/5K29c_dUS5-_UvVNosyZew/0): 0 bytes
[2025-11-13T09:58:50,176][INFO ][c.p.p.e.ParquetExecutionEngine] [842f575163fb] Native memory used by VSR Buffer Pool: 0
[2025-11-13T09:58:50,176][INFO ][c.p.p.e.ParquetExecutionEngine] [842f575163fb] Native memory used by ArrowWriters in shard path /Users/raghraaj/workplace/raghuvanshraj/OpenSearch-Fork/build/distribution/local/opensearch-3.3.0-SNAPSHOT/data/nodes/0/indices/5K29c_dUS5-_UvVNosyZew/0: 0
[2025-11-13T09:58:50,176][INFO ][o.o.i.IndexingMemoryController] [842f575163fb] total indexing heap bytes used [0b] vs indices.memory.index_buffer_size [102.3mb], total native bytes used [0b] vs native buffer [4.6gb], currently writing bytes [0b]
|                                                         Metric |         Task |       Value |   Unit |
|---------------------------------------------------------------:|-------------:|------------:|-------:|
|                     Cumulative indexing time of primary shards |              |     246.923 |    min |
|             Min cumulative indexing time across primary shards |              |     246.923 |    min |
|          Median cumulative indexing time across primary shards |              |     246.923 |    min |
|             Max cumulative indexing time across primary shards |              |     246.923 |    min |
|            Cumulative indexing throttle time of primary shards |              |     8.43155 |    min |
|    Min cumulative indexing throttle time across primary shards |              |     8.43155 |    min |
| Median cumulative indexing throttle time across primary shards |              |     8.43155 |    min |
|    Max cumulative indexing throttle time across primary shards |              |     8.43155 |    min |
|                        Cumulative merge time of primary shards |              |           0 |    min |
|                       Cumulative merge count of primary shards |              |           0 |        |
|                Min cumulative merge time across primary shards |              |           0 |    min |
|             Median cumulative merge time across primary shards |              |           0 |    min |
|                Max cumulative merge time across primary shards |              |           0 |    min |
|               Cumulative merge throttle time of primary shards |              |           0 |    min |
|       Min cumulative merge throttle time across primary shards |              |           0 |    min |
|    Median cumulative merge throttle time across primary shards |              |           0 |    min |
|       Max cumulative merge throttle time across primary shards |              |           0 |    min |
|                      Cumulative refresh time of primary shards |              |           0 |    min |
|                     Cumulative refresh count of primary shards |              |           0 |        |
|              Min cumulative refresh time across primary shards |              |           0 |    min |
|           Median cumulative refresh time across primary shards |              |           0 |    min |
|              Max cumulative refresh time across primary shards |              |           0 |    min |
|                        Cumulative flush time of primary shards |              |     2.20545 |    min |
|                       Cumulative flush count of primary shards |              |        2237 |        |
|                Min cumulative flush time across primary shards |              |     2.20545 |    min |
|             Median cumulative flush time across primary shards |              |     2.20545 |    min |
|                Max cumulative flush time across primary shards |              |     2.20545 |    min |
|                                        Total Young Gen GC time |              |       3.451 |      s |
|                                       Total Young Gen GC count |              |         182 |        |
|                                          Total Old Gen GC time |              |           0 |      s |
|                                         Total Old Gen GC count |              |           0 |        |
|                                                     Store size |              |   1.017e-05 |     GB |
|                                                  Translog size |              | 5.12227e-08 |     GB |
|                                         Heap used for segments |              |           0 |     MB |
|                                       Heap used for doc values |              |           0 |     MB |
|                                            Heap used for terms |              |           0 |     MB |
|                                            Heap used for norms |              |           0 |     MB |
|                                           Heap used for points |              |           0 |     MB |
|                                    Heap used for stored fields |              |           0 |     MB |
|                                                  Segment count |              |           0 |        |
|                                                 Min Throughput | index-append |     40854.3 | docs/s |
|                                                Mean Throughput | index-append |     41656.3 | docs/s |
|                                              Median Throughput | index-append |       41503 | docs/s |
|                                                 Max Throughput | index-append |     42967.4 | docs/s |
|                                        50th percentile latency | index-append |     843.742 |     ms |
|                                        90th percentile latency | index-append |     1026.79 |     ms |
|                                        99th percentile latency | index-append |     1254.06 |     ms |
|                                      99.9th percentile latency | index-append |     1436.25 |     ms |
|                                     99.99th percentile latency | index-append |     2225.36 |     ms |
|                                       100th percentile latency | index-append |     2407.73 |     ms |
|                                   50th percentile service time | index-append |     843.742 |     ms |
|                                   90th percentile service time | index-append |     1026.79 |     ms |
|                                   99th percentile service time | index-append |     1254.06 |     ms |
|                                 99.9th percentile service time | index-append |     1436.25 |     ms |
|                                99.99th percentile service time | index-append |     2225.36 |     ms |
|                                  100th percentile service time | index-append |     2407.73 |     ms |
|                                                     error rate | index-append |           0 |      % |

For comparison, refer to the benchmark results here: #19941

@raghuvanshraj raghuvanshraj requested a review from a team as a code owner November 13, 2025 04:33
@github-actions
Contributor

❌ Gradle check result for b354835: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@github-actions
Contributor

❌ Gradle check result for 1efdab6: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@raghuvanshraj raghuvanshraj force-pushed the feature/datafusion-logging branch from 1efdab6 to 14a63dc Compare November 14, 2025 05:27
@github-actions
Contributor

❌ Gradle check result for 14a63dc: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@raghuvanshraj raghuvanshraj force-pushed the feature/datafusion-logging branch from 14a63dc to b9a9c91 Compare November 17, 2025 07:36
@github-actions
Contributor

❌ Gradle check result for b9a9c91: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Ok(())
})();

// If logging fails, fall back to println as last resort
Member

In which cases is the above expected to fail?
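
The snippet under review is only partly visible here, but it appears to wrap the bridge call in an immediately invoked closure that returns a `Result`, with `println` as the last resort. A minimal sketch of that pattern follows, under stated assumptions: `BridgeError`, its variants, `Delivered`, `log_with_fallback`, and the `[RUST_LOG_FALLBACK]` marker are hypothetical, and the `bridge` argument stands in for the JNI call. The two variants are examples of how a JNI call can fail in general (the thread cannot be attached to the JVM, or the target class or method cannot be resolved), not a statement about this PR's code.

```rust
use std::fmt;

/// Reasons the bridge call might fail. Hypothetical variants chosen for
/// illustration; the PR's real error type is not shown.
#[derive(Debug, PartialEq, Eq)]
pub enum BridgeError {
    /// The current thread could not be attached to the JVM.
    NotAttached,
    /// The Java logger class or method could not be resolved.
    MethodNotFound,
}

impl fmt::Display for BridgeError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            BridgeError::NotAttached => write!(f, "thread not attached to JVM"),
            BridgeError::MethodNotFound => write!(f, "logger method not found"),
        }
    }
}

/// Which path delivered the message.
#[derive(Debug, PartialEq, Eq)]
pub enum Delivered {
    Bridge,
    Stdout,
}

/// Tries the bridge first and falls back to stdout when it fails.
/// `bridge` is a stand-in for the JNI call into the Java logger.
pub fn log_with_fallback<F>(message: &str, bridge: F) -> Delivered
where
    F: Fn(&str) -> Result<(), BridgeError>,
{
    // Immediately invoked closure so `?` can short-circuit on the first error.
    let result = (|| -> Result<(), BridgeError> {
        bridge(message)?;
        Ok(())
    })();

    match result {
        Ok(()) => Delivered::Bridge,
        Err(err) => {
            // Last resort: the message still reaches stdout.
            println!("[RUST_LOG_FALLBACK] {message} (bridge error: {err})");
            Delivered::Stdout
        }
    }
}
```

Returning which path was taken is a choice made for this sketch so the fallback can be asserted in a test; a production logger would more likely return nothing.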

@raghuvanshraj raghuvanshraj force-pushed the feature/datafusion-logging branch from b9a9c91 to 03d0604 Compare November 28, 2025 09:05
@coderabbitai

coderabbitai bot commented Nov 28, 2025

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


@github-actions
Contributor

❌ Gradle check result for 03d0604: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@mgodwan
Member

mgodwan commented Nov 28, 2025

LGTM. Please ensure the DF E2E tests pass.

@bharath-techie bharath-techie merged commit ad3dd8b into opensearch-project:feature/datafusion Nov 28, 2025
7 of 31 checks passed
@bharath-techie
Contributor

Test 4 has an approximation error, so I am merging this.
