Skip to content

Conversation

@etiaro
Copy link

@etiaro etiaro commented May 19, 2025

Description

Implement parser for phoenix-rtos/phoenix-rtos-utils#242 that allows post-mortem debugging of crashed process using GDB.

Motivation and Context

RTOS-1043 Research core dump and backtrace techniques
RTOS-1049 Implement core dump on MMU targets
RTOS-1062 Implement core dump on NOMMU targets

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Chore (refactoring, style fixes, git/CI config, submodule management, no code logic changes)

How Has This Been Tested?

  • Already covered by automatic testing.
  • New test added: (add PR link here).
  • Tested by hand on: (list targets here).

Checklist:

  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have added tests to cover my changes.
  • All new and existing linter checks and tests passed.
  • My changes generate no new compilation warnings for any of the targets.

Special treatment

  • This PR needs additional PRs to work (list the PRs, preferably in merge-order).
  • I will merge this PR by myself when appropriate.

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello @etiaro, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

Summary of Changes

Hello! Gemini here, providing a summary of this pull request.

This PR introduces a new host-side utility, coredump_parser, designed to process coredump data generated by the Phoenix RTOS kernel (specifically the format implemented in the linked PR #649). The primary goal is to enable post-mortem debugging of crashed processes on the target system using standard host tools like GDB.

The parser reads encoded coredump data from standard input, which is expected to be in a specific format including start/end markers, process information, Base64 encoding, and Run-Length Encoding (RLE). It decodes the data, performs a CRC32 validation to ensure integrity, and then writes the raw memory dump to a file in a specified output directory. The utility is designed to handle multiple coredumps streamed via stdin and generates unique filenames for each dump.

Highlights

  • New Host Utility: Adds a new executable coredump_parser to the project, intended to run on the host machine.
  • Coredump Parsing Logic: Implements the core logic for reading, decoding (Base64 and RLE), and validating (CRC32) Phoenix RTOS coredump data streamed via stdin.
  • File Output for Debugging: Writes the decoded raw memory data to a file in a user-specified directory, formatted for use with debuggers like GDB.
  • Handles Multiple Dumps: The parser can process multiple coredumps from a single input stream, generating unique output files for each.

Changelog

Click here to see the changelog
  • coredump_parser/Makefile
    • Added a new Makefile to build the coredump_parser executable.
  • coredump_parser/parser.cpp
    • Added the main C++ source file for the coredump parser.
    • Includes functions for Base64 decoding (b64_index, read_b64).
    • Includes function for Run-Length Encoding (RLE) decoding (decode_rle).
    • Includes function for CRC32 validation (check_crc).
    • Implements logic to read from stdin, detect coredump markers (_COREDUMP_START_, _COREDUMP_END_), extract process info, and call decoding/validation functions (watch_stdin, read_decode).
    • Includes function to write the decoded memory data to a file (parse_dump).
    • Includes function to generate unique output filenames based on process name, CRC, and index (get_output_path).
    • Contains the main function for argument parsing and the main processing loop.
  • coredump_parser/readme.md
    • Added a README file explaining how to use the coredump_parser utility.
    • Provides instructions for debugging using gdb-multiarch.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in issue comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.


Core dump arrives,
Base64, RLE unwinds,
Debug begins now.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a parser for Phoenix coredumps, enabling post-mortem debugging with GDB. This is a valuable addition for diagnostics.

The implementation covers base64 decoding, RLE decoding, and CRC checking. The overall structure is logical. However, I've identified a few issues, including some critical and high-severity ones related to potential out-of-bounds memory access and error handling, that should be addressed before merging. There are also some medium-severity suggestions for improving code robustness and clarity.

No explicit style guide was provided, so feedback is based on common C++ best practices focusing on correctness, maintainability, and safety.

Summary of Findings

  • Critical: Out-of-Bounds Read in CRC Check Debug Message: A debug print statement in check_crc (parser.cpp:121) attempts an unsafe memory access by trying to read the CRC from a data buffer that has already had the CRC removed. This will lead to an out-of-bounds read and likely crash if the remaining data size is less than 4 bytes.
  • High: Potential Out-of-Bounds Read During CRC Extraction: In read_decode (parser.cpp:141), the CRC is extracted from the decoded data. If the decoded data is shorter than 4 bytes, this operation will read out of bounds, causing a crash or undefined behavior. A size check is needed.
  • High: Shadowed Loop Variable: In check_crc (parser.cpp:113), an inner loop variable i shadows an outer loop variable i. This is poor practice and can lead to confusion or bugs.
  • Medium: Type Punning for CRC Extraction: In read_decode (parser.cpp:141), type punning is used to convert bytes to uint32_t for CRC extraction. Using memcpy is a safer and more standards-compliant approach.
  • Medium: Ambiguous EOF Return Code: The watch_stdin function (parser.cpp:172) returns EOF. It would be clearer to use a custom status code (perhaps from an enum) to distinguish clean end-of-file from other error conditions.
  • Medium: Integer Error Codes vs. Enum Class: Error codes (parser.cpp:8-11) are defined as const int. Using an enum class would provide better type safety and semantic grouping.
  • Low: Copyright Year in Makefile (Not Commented): The Makefile (coredump_parser/Makefile:4) has a copyright year of 2025, which is likely a typo. (Severity: low, not commented due to review settings)
  • Low: Use of C Headers (Not Commented): The C header <stdio.h> is used (parser.cpp:5). In C++, <cstdio> is generally preferred for C standard library headers. (Severity: low, not commented due to review settings)
  • Low: Error Messages Stream (Not Commented): Error messages are printed to std::cout. std::cerr is the standard stream for error messages. (Severity: low, not commented due to review settings)
  • Low: C-style Typedef Struct (Not Commented): The C-style typedef struct { ... } additional_info; (parser.cpp:14-18) is used. In C++, struct additional_info { ... }; is sufficient. (Severity: low, not commented due to review settings)

Merge Readiness

This pull request introduces a significant and useful feature. However, due to the presence of critical and high-severity issues, particularly those related to potential out-of-bounds memory access which could lead to crashes, I recommend that these issues be addressed before the pull request is merged. The medium-severity suggestions should also be considered for improving the overall quality and robustness of the code.

I am not authorized to approve pull requests, but based on this review, changes are requested.

@etiaro etiaro force-pushed the etiaro/coredump-parser branch from 887478f to f13f1c5 Compare May 19, 2025 13:49
@etiaro etiaro marked this pull request as ready for review May 19, 2025 14:06
@etiaro etiaro requested a review from Darchiv May 19, 2025 14:06
@etiaro etiaro force-pushed the etiaro/coredump-parser branch from f13f1c5 to c9f50a3 Compare May 20, 2025 09:31
@etiaro etiaro force-pushed the etiaro/coredump-parser branch 2 times, most recently from 9b4cdf5 to f803dbc Compare June 5, 2025 15:19
@etiaro etiaro force-pushed the etiaro/coredump-parser branch from f803dbc to 168ec1c Compare June 5, 2025 15:22
@etiaro etiaro force-pushed the etiaro/coredump-parser branch from 168ec1c to 398402d Compare June 5, 2025 15:25
@etiaro etiaro force-pushed the etiaro/coredump-parser branch from 00e0171 to 82adfdd Compare August 13, 2025 12:08
@etiaro etiaro force-pushed the etiaro/coredump-parser branch from 82adfdd to 49183c6 Compare August 13, 2025 12:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants