@inikep commented Dec 5, 2025

  1. Refactors the JSON serialization logic within AuditJsonHandler to guarantee strict compliance with the JSON standard by eliminating trailing commas and centralizing field separation control.

The previous approach of appending ", " after every value handler was inconsistent and required error-prone comma removal logic in EndObject.

This change adopts a safer, state-driven approach:

  • Centralized Comma Management: The responsibility for adding the comma separator is moved entirely from the value handlers (Int, String, etc.) to the Key() handler.
  • State-Driven Separation: The new m_is_first_field state flag, set in StartObject() and checked/updated in Key(), ensures a comma is prepended only when necessary (i.e., not for the first field), thereby naturally preventing trailing commas within objects.
  • Inter-Event Separation: Confirmed that the ,\n separator is correctly appended to separate top-level audit event objects in the array.
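
The state-driven separation described above can be illustrated with a minimal sketch. `SketchJsonWriter` and its `str()` accessor are hypothetical stand-ins, not the actual `AuditJsonHandler` code; the sketch assumes flat objects (no nesting) and skips string escaping.

```cpp
#include <cstdint>
#include <string>

// Hypothetical, simplified stand-in for AuditJsonHandler. Only the
// comma-placement logic is modeled; flat objects are assumed.
class SketchJsonWriter {
 public:
  void StartObject() {
    m_out += '{';
    m_is_first_field = true;  // the next Key() must not emit a separator
  }

  void Key(const std::string &name) {
    // The separator is prepended to every field except the first one,
    // so no trailing comma can ever be produced.
    if (!m_is_first_field) m_out += ", ";
    m_is_first_field = false;
    m_out += '"' + name + "\": ";
  }

  // Value handlers no longer append any separator.
  void Int(std::int64_t value) { m_out += std::to_string(value); }
  void String(const std::string &value) { m_out += '"' + value + '"'; }

  void EndObject() { m_out += '}'; }  // nothing to strip here

  const std::string &str() const { return m_out; }

 private:
  std::string m_out;
  bool m_is_first_field = true;
};
```

Because only `Key()` emits the separator, `EndObject()` needs no cleanup step.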
  2. The audit_log_read() UDF was failing to respect the max_array_length parameter, returning all records instead of the specified limit. This was caused by the read loop not checking the is_batch_end flag.

Additionally, attempting to read the remaining records in a subsequent call caused an infinite loop or parsing errors. This occurred because a new rapidjson::Reader was created for each call, losing the internal state required to resume parsing mid-stream (e.g., handling the comma separator between array elements).

The fix involves:

  • Respecting the is_batch_end flag in the AuditLogReader::read loop to stop processing when the limit is reached.
  • Storing the rapidjson::Reader instance within AuditLogReaderContext to preserve parsing state across multiple audit_log_read() calls.
  • Adding error checking for reader->HasParseError() to prevent infinite loops on malformed data or state mismatches.
  • Updating the udf_audit_log_read_validate_output test case to verify correct behavior for max_array_length.
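
The shape of the fix can be sketched as follows. This is an illustration under stated assumptions, not the patch itself: `SketchParser`, `SketchReaderContext`, and `ReadBatch` are hypothetical names, the log is modeled as a vector of strings, and an empty string stands in for malformed input. `SketchParser` plays the role of the persistent `rapidjson::Reader` and does not use the rapidjson API.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a parser whose state must survive between calls.
struct SketchParser {
  std::size_t pos = 0;     // resume position within the input
  bool has_error = false;  // sticky, similar in spirit to HasParseError()

  // Yields the next record. Returns false at end of input or on a
  // malformed (here: empty) record.
  bool ParseNext(const std::vector<std::string> &input, std::string &out) {
    if (pos >= input.size()) return false;
    if (input[pos].empty()) {
      has_error = true;
      return false;
    }
    out = input[pos++];
    return true;
  }
};

// Hypothetical stand-in for AuditLogReaderContext: the parser lives here,
// so it is not recreated on every read call.
struct SketchReaderContext {
  SketchParser parser;
  bool is_batch_end = false;
};

// Reads at most max_array_length records and resumes where the previous
// call stopped.
std::vector<std::string> ReadBatch(SketchReaderContext &ctx,
                                   const std::vector<std::string> &log,
                                   std::size_t max_array_length) {
  std::vector<std::string> batch;
  ctx.is_batch_end = (max_array_length == 0);
  std::string record;
  while (!ctx.is_batch_end) {
    // Stop on end of input or on a parse error instead of spinning.
    if (!ctx.parser.ParseNext(log, record)) break;
    batch.push_back(record);
    if (batch.size() >= max_array_length) ctx.is_batch_end = true;
  }
  return batch;
}
```

Calling `ReadBatch` repeatedly with the same context returns consecutive batches, and once the error flag is set every later call returns an empty batch rather than looping.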

@inikep inikep requested a review from dlenev December 5, 2025 08:29