-
Notifications
You must be signed in to change notification settings - Fork 1.8k
out_cloudwatch_logs: increase MAX_EVENT_LEN to 1MB with tests #11159
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. WalkthroughUpdated CloudWatch Logs public header: increased Changes
Sequence Diagram(s)Not applicable — changes are macro/documentation updates and new test cases; no runtime control-flow or component interaction changes to diagram. Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes
Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (2)
🧰 Additional context used🧠 Learnings (2)📚 Learning: 2025-10-23T07:43:16.216ZApplied to files:
📚 Learning: 2025-09-04T12:35:22.872ZApplied to files:
🧬 Code graph analysis (1)tests/runtime/out_cloudwatch.c (3)
🔇 Additional comments (11)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
7c85150 to
ee95b3b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
plugins/out_cloudwatch_logs/cloudwatch_api.h(1 hunks)
🔇 Additional comments (1)
plugins/out_cloudwatch_logs/cloudwatch_api.h (1)
56-56: I'll search the repository directly for the constants to verify the buffer calculations:Based on my verification, here's the rewritten review comment:MAX_EVENT_LEN increase to 1,000,000 is correct and properly handled.
The increase to 1,000,000 bytes correctly reflects CloudWatch's 2025 update supporting log events up to 1 MB. Verification confirms the buffer logic is sound:
- Individual events are truncated at MAX_EVENT_LEN (cloudwatch_api.c:558-561)
- Payload size is calculated as the sum of all event messages in UTF-8, plus 26 bytes for each log event, and the code conservatively uses 42 bytes per event
- Batch size check (cloudwatch_api.c:891) prevents payloads from exceeding PUT_LOG_EVENTS_PAYLOAD_SIZE
- Maximum single event with overhead: 1,000,000 + 42 = 1,000,042 bytes fits within the 1,048,576-byte batch limit
No buffer sizing issues detected.
3d7597b to
ee654ea
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (5)
tests/runtime/out_cloudwatch.c (5)
12-14: Avoid duplicatingMAX_EVENT_LENand keep tests tied to the real limitHere you reintroduce
MAX_EVENT_LENlocally but it’s not used anywhere in this file, and the actual definition already lives incloudwatch_api.h. That creates a drift risk if the plugin constant changes again.Consider either:
- Including the CloudWatch API header and using its
MAX_EVENT_LENdirectly in the tests, or- Wiring these tests to use this macro (e.g.,
create_large_json_message(MAX_EVENT_LEN - 1000), constructing “over-limit” payloads asMAX_EVENT_LEN + delta) instead of hard‑coded sizes later in the file.
399-430:create_large_json_messageis bounds-safe but relies on magic offsetsThe allocation and index math look safe (no obvious buffer overruns), but the use of hard-coded values (
12for the prefix offset,15for the “overhead” size) makes this helper fragile if the JSON structure ever changes.You could make this more robust and self-documenting by defining prefix/suffix strings and using
strlenfor lengths, e.g.const char *prefix = "{\"message\":\""; const char *suffix = "\"}";and computing offsets from those instead of literal12/15.
432-468: Nice reuse of CloudWatch setup; consider minimal defensive checksThis helper cleanly de-duplicates the common setup/teardown logic and pushes arbitrary JSON payloads via the existing lib input/output harness. Functionally it’s consistent with the rest of the file.
If you want to harden it slightly, you could add a
TEST_CHECK(ctx != NULL);afterflb_create()to fail fast on allocation/init errors, but that’s in line with (and optional relative to) existing tests.
470-498: UseMAX_EVENT_LEN(or AWS max) instead of hard-coded sizes in new large-event testsBoth
flb_test_cloudwatch_event_size_near_limitandflb_test_cloudwatch_event_size_at_aws_maxcurrently hard-code999000and1048500while the file definesMAX_EVENT_LENand the PR is about aligning with CloudWatch’s updated maximum.To keep these tests robust against future limit changes and better reflect their intent:
- For the “near limit” test, derive the size from
MAX_EVENT_LEN(e.g.,MAX_EVENT_LEN - 1000or similar), rather than999000directly.- For the “AWS max” test, consider either:
- Using a named constant that documents how it relates to the AWS limit (e.g.,
AWS_MAX_EVENT_SIZE_APPROX), or- Deriving it from
MAX_EVENT_LENplus a clear delta that documents “this should be truncated down toMAX_EVENT_LEN”.That would remove magic numbers and ensure these tests stay aligned with the semantics they’re meant to cover.
500-525: Backslash-boundary test indexing looks correct but could reuse the JSON helperThe manual construction of the backslash-heavy payload appears bounds-safe: the loop limit (
i < 999980), the base offset (12), and the subsequentstrcpy(large_json + 12 + 999980, "\"}")all fit well within theflb_malloc(1000100)buffer.To reduce the amount of manual index arithmetic and keep the JSON shape consistent with other tests, you could consider building the bulk of the message via
create_large_json_message()and then post-processing the payload (e.g., walking the message body and replacing selected characters with\\) instead of re-encoding `{"message":"..."} by hand here.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
plugins/out_cloudwatch_logs/cloudwatch_api.h(1 hunks)tests/runtime/out_cloudwatch.c(3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- plugins/out_cloudwatch_logs/cloudwatch_api.h
🧰 Additional context used
🧬 Code graph analysis (1)
tests/runtime/out_cloudwatch.c (3)
plugins/out_s3/s3.c (1)
setenv(57-60)src/flb_lib.c (9)
flb_create(143-225)flb_input(266-276)flb_input_set(305-335)flb_output(279-289)flb_output_set(520-551)flb_start(983-994)flb_lib_push(843-870)flb_stop(1011-1055)flb_destroy(228-263)include/fluent-bit/flb_mem.h (1)
flb_free(126-128)
🔇 Additional comments (1)
tests/runtime/out_cloudwatch.c (1)
539-541: Registering new tests inTEST_LISTlooks correctThe new tests are properly added to
TEST_LISTwith descriptive names and the required{NULL, NULL}terminator remains intact. This should integrate cleanly with the existing runtime test harness.
ee654ea to
3bf4539
Compare
3bf4539 to
bb01b00
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
tests/runtime/out_cloudwatch.c (1)
403-437: Consider safer string operations and add documentation.The function is correct but could be improved:
strcpy usage (lines 424, 433): Consider using
memcpyorstrncpyfor explicit length control, even though the current usage is safe.Silent size adjustment (lines 414-416): The function silently adjusts
target_sizeif it's too small. Consider returningNULLor documenting this behavior.Missing documentation: Add a comment noting that the caller must free the returned pointer.
Example improvement for the strcpy calls:
- strcpy(json, TEST_JSON_PREFIX); + memcpy(json, TEST_JSON_PREFIX, prefix_len);- strcpy(json + prefix_len + data_size, TEST_JSON_SUFFIX); + memcpy(json + prefix_len + data_size, TEST_JSON_SUFFIX, suffix_len);
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
plugins/out_cloudwatch_logs/cloudwatch_api.h(1 hunks)tests/runtime/out_cloudwatch.c(3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
tests/runtime/out_cloudwatch.c (3)
plugins/out_s3/s3.c (1)
setenv(57-60)src/flb_lib.c (9)
flb_create(143-225)flb_input(266-276)flb_input_set(305-335)flb_output(279-289)flb_output_set(520-551)flb_start(983-994)flb_lib_push(843-870)flb_stop(1011-1055)flb_destroy(228-263)include/fluent-bit/flb_mem.h (1)
flb_free(126-128)
🔇 Additional comments (7)
plugins/out_cloudwatch_logs/cloudwatch_api.h (2)
46-56: LGTM! Conservative limit with clear documentation.The 1MB limit with ~4.6% safety margin is a sound approach for production reliability. The comment clearly documents the rationale and empirical testing results.
60-60: Good addition for maintainability.Defining the prefix length as a constant improves code clarity and reduces magic numbers.
tests/runtime/out_cloudwatch.c (5)
8-9: LGTM! Correct header inclusion.Including the CloudWatch API header provides access to MAX_EVENT_LEN and other constants needed for the new tests.
15-17: LGTM! Clear test structure constants.These constants appropriately define the JSON envelope for constructing test messages.
439-476: LGTM! Standard test helper implementation.The function properly sets up the test environment, handles optional data injection, and ensures cleanup.
478-534: LGTM! Comprehensive edge-case coverage.The three test functions properly exercise large-event handling:
- At-limit events
- Over-limit events requiring truncation
- Backslash handling at truncation boundaries (important for JSON escaping)
Memory management is correct in all tests. The tests follow the existing pattern of integration-style validation without explicit truncation assertions.
Note: The test names in code (
event_size_at_limit,event_size_over_limit) differ from those mentioned in the PR objectives (event_size_near_limit,event_size_at_aws_max). The current names are more descriptive and accurate.
548-550: LGTM! Proper test registration.The three new tests are correctly registered in the test suite with names matching their function declarations.
bb01b00 to
f844d36
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (4)
tests/runtime/out_cloudwatch.c (4)
403-438: Large JSON generator is correct; minor optional simplificationThe helper correctly:
- enforces a minimum
target_sizeto hold prefix + at least one data byte + suffix,- allocates
target_size + 1and null‑terminates atjson[target_size],- fills the payload so that
strlen(json) == target_size.If you ever want to shave a few lines, the data fill loop could be replaced with a
memsetstarting atjson + prefix_lenfordata_sizebytes, but that’s purely cosmetic in a test helper.
440-477: Shared CloudWatch test harness looks consistent with existing patterns
run_cloudwatch_test_with_datamirrors the setup/teardown logic of the earlier tests (env flag, lib input, cloudwatch_logs output, start/push/stop/destroy) and correctly frees resources on the happy path.Optionally, you could assert on the return value of
flb_lib_pushto catch immediate push failures, but given the existing tests do not, this is non‑blocking.
479-492: Limit-size test correctly drives a MAX_EVENT_LEN‑sized payload through the pipeline
create_large_json_message(MAX_EVENT_LEN)followed bystrlen(large_json)ensures the pushed payload is exactlyMAX_EVENT_LENbytes long, which is what the truncation logic should treat as in‑bounds.The test currently asserts only that setup succeeds and the pipeline runs without crashing, which matches the style of existing runtime tests here. If you later add hooks to inspect the mock AWS backend, this would be a good place to assert that the event was not truncated.
494-507: Over-limit test exercises truncation path, though assertions are coarseUsing
create_large_json_message(MAX_EVENT_LEN + 1)ensures the input exceeds the per‑event cap and should force the CloudWatch code down the truncation path.As with the previous test, this currently validates “no crash / no fatal error” rather than inspecting the truncated size. That’s acceptable for a runtime smoke test, but if you later expose the observed payload length from the mock backend, asserting that it equals
MAX_EVENT_LENwould make this test much stronger.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
plugins/out_cloudwatch_logs/cloudwatch_api.h(1 hunks)tests/runtime/out_cloudwatch.c(3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
tests/runtime/out_cloudwatch.c (3)
plugins/out_s3/s3.c (1)
setenv(57-60)src/flb_lib.c (9)
flb_create(143-225)flb_input(266-276)flb_input_set(305-335)flb_output(279-289)flb_output_set(520-551)flb_start(983-994)flb_lib_push(843-870)flb_stop(1011-1055)flb_destroy(228-263)include/fluent-bit/flb_mem.h (1)
flb_free(126-128)
🔇 Additional comments (2)
tests/runtime/out_cloudwatch.c (2)
8-17: Using shared constants for JSON shape and MAX_EVENT_LEN is a good choiceIncluding
cloudwatch_api.hand definingTEST_JSON_PREFIX/TEST_JSON_SUFFIXkeeps the tests aligned with the real CloudWatch event constraints and avoids duplicated magic numbers. No issues here.
549-551: New tests are correctly registered in TEST_LISTThe three new test cases are wired into
TEST_LISTwith clear names matching their purposes, so they will run under the existing runtime test harness without further changes.
Increase MAX_EVENT_LEN from 262,118 bytes (256 KiB) to 1,000,000 bytes (1 MB) to better align with AWS CloudWatch's documented maximum event size of 1,048,576 bytes (1 MiB). The 1 MB limit provides a ~4.6% safety margin to account for JSON encoding overhead. Testing confirmed messages up to 1,048,546 bytes (encoding to 1,048,586 bytes) succeed, though we use a conservative limit for production safety. Add runtime tests to validate the new limit: - event_size_at_limit: Validates events at exactly MAX_EVENT_LEN (1MB) are accepted - event_size_over_limit: Validates events exceeding MAX_EVENT_LEN are truncated - event_truncation_with_backslash: Validates backslash handling at truncation boundary Signed-off-by: Shelby Hagman <[email protected]>
f844d36 to
553898e
Compare
Summary
CloudWatch logs PutLogEvents API change in 2025 to support events up to 1MB (ref) while our defaults reflect the older value 256KB.
Add runtime tests to validate the new limit:
Enter
[N/A]in the box, if an item is not applicable to your change.Testing
Before we can approve your change; please submit the following in a comment:
Gist: https://gist.github.com/ShelbyZ/76bfac405e03723efeb532a48d5b6a87
Over-limit on current logs
New limit
Tested directly against CloudWatch Logs via PutLogEvents to discover that a maximum value of
1,048,546bytes could be sent without issue. Using 1,000,000 bytes (1MB) gives a safe margin to avoid calculations for thePUT_LOG_EVENTS_PAYLOAD_SIZEwhich would need reevaluation to support observed max. The encoded max size was also1,048,586bytes which is 10 bytes about 1MiB (1,048,576) which makes me worry that depending on an exact max may break over time.If this is a change to packaging of containers or native binaries then please confirm it works for all targets.
ok-package-testlabel to test for all targets (requires maintainer to do).Documentation
Backporting
Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.
Summary by CodeRabbit
Improvements
Tests
✏️ Tip: You can customize this high-level summary in your review settings.