@etiaro etiaro commented Aug 11, 2025

JIRA: RTOS-1054

Description

Introduces a coredump_server application that runs in user space and prints and saves coredumps of crashing processes, allowing post-mortem analysis with GDB.
It communicates with the kernel over a predefined port, similarly to klog, using message structures defined in the kernel includes.
It uses a new device, /dev/coredumpctrl, for runtime configuration; settings are changed by running the coredump_server program with configuration options.
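
The description does not list the configuration options themselves. The review summary in this conversation mentions a thread limit, a memory scope, a floating-point context switch, a maximum chunk size, and an output path. The following is only a sketch of how such settings could be grouped and sanity-checked; every type, field, and function name is hypothetical, and the validity rules are assumptions rather than the PR's behavior.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* All names below are hypothetical and are not taken from the PR. */
typedef enum {
	demo_scope_exc_stack,  /* only the stack of the thread that raised the exception */
	demo_scope_all_stacks, /* stacks of all dumped threads */
	demo_scope_all_memory  /* every mapped region of the process */
} demo_mem_scope_t;

typedef struct {
	size_t max_threads;     /* upper bound on threads included in the dump */
	demo_mem_scope_t scope; /* how much memory to include */
	bool fp_context;        /* include floating-point registers */
	size_t max_chunk;       /* largest memory chunk emitted at once, in bytes */
	const char *save_path;  /* NULL means: print to the console only */
} demo_coredump_opts_t;

/* Assumed rules: limits must be non-zero and a save path, if given, must be non-empty. */
bool demo_opts_valid(const demo_coredump_opts_t *o)
{
	assert(o != NULL);
	if (o->max_threads == 0 || o->max_chunk == 0) {
		return false;
	}
	if (o->save_path != NULL && o->save_path[0] == '\0') {
		return false;
	}
	return true;
}
```

Keeping the options in one plain structure would let the same data be passed through a control device unchanged, but whether the PR does this is not stated here.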

Motivation and Context

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Chore (refactoring, style fixes, git/CI config, submodule management, no code logic changes)

How Has This Been Tested?

  • Already covered by automatic testing.
  • New test added: (add PR link here).
  • Tested by hand on: ia32-generic-qemu, armv7r5f-zynqmp-qemu, sparcv8leon-generic-qemu

Checklist:

  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have added tests to cover my changes.
  • All new and existing linter checks and tests passed.
  • My changes generate no new compilation warnings for any of the targets.

Special treatment

@gemini-code-assist gemini-code-assist bot left a comment

Summary of Changes

Hello @etiaro, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a new coredump server that automatically captures the state of a crashed process. It generates standard ELF coredump files, which are essential for debugging and understanding system failures. The server is designed with a modular architecture, including a Hardware Abstraction Layer (HAL) for cross-platform compatibility and a flexible settings mechanism to control the scope and output of the coredumps. This enhancement significantly improves the system's diagnostic capabilities by providing detailed crash information.

Highlights

  • New Coredump Server: This pull request introduces a new coredump_server component, designed to capture the state of a crashed process. This server will generate standard ELF-formatted coredump files, providing a comprehensive snapshot for post-mortem debugging.
  • Broad Architecture Support: The server is integrated into the build system across numerous target architectures (e.g., AArch64, ARMv7, IA32, RISC-V, SPARCv8LEON) by adding it to their DEFAULT_COMPONENTS in respective Makefiles. This ensures broad platform support for crash analysis.
  • Configurable Coredump Generation: The coredump server is highly configurable, allowing users to specify parameters such as the maximum number of threads to dump, the memory scope (e.g., only exception thread stack, all stacks, or all memory), whether to include floating-point context, and the maximum chunk size for memory dumps. It also supports saving coredumps to a specified file path or printing them to the console.
  • Architecture-Specific Context Capture (HAL): A new Hardware Abstraction Layer (HAL) has been implemented to handle architecture-specific details of capturing CPU registers, FPU contexts, and other process-related auxiliary notes, ensuring accurate and consistent coredump generation across diverse hardware.
  • Efficient Data Encoding and Integrity: The coredump data is encoded using Base64 and Run-Length Encoding (RLE), and includes a CRC32 checksum for data integrity. This ensures efficient and reliable transfer and storage of the coredump information.
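
To make the encoding step above more concrete, here is a minimal sketch of two of the three pieces: a CRC32 checksum and a simple run-length encoder. It assumes the common reflected CRC-32 polynomial 0xEDB88320 and a hypothetical (count, value) RLE format; the PR may use a different polynomial, a table-driven CRC, or another RLE layout. The Base64 stage is left out, and both function names are illustrative only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 with the reflected polynomial 0xEDB88320 (the zlib/Ethernet variant).
 * Pass 0 as `crc` for the first chunk and the previous result for later chunks. */
uint32_t crc32_update(uint32_t crc, const uint8_t *data, size_t len)
{
	assert(data != NULL || len == 0);
	crc = ~crc;
	for (size_t i = 0; i < len; i++) {
		crc ^= data[i];
		for (int bit = 0; bit < 8; bit++) {
			/* Shift right; XOR in the polynomial only if the dropped bit was 1. */
			crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
		}
	}
	return ~crc;
}

/* Hypothetical RLE format: each run becomes a (count, value) byte pair, count in 1..255.
 * Returns the number of bytes written, or 0 if `out` is too small or the input is empty. */
size_t rle_encode(const uint8_t *in, size_t len, uint8_t *out, size_t cap)
{
	size_t w = 0;

	assert(in != NULL && out != NULL);
	for (size_t i = 0; i < len;) {
		size_t run = 1;
		while (i + run < len && run < 255 && in[i + run] == in[i]) {
			run++;
		}
		if (cap - w < 2) {
			return 0;
		}
		out[w++] = (uint8_t)run;
		out[w++] = in[i];
		i += run;
	}
	return w;
}
```

Run-length encoding pays off on memory dumps because stacks and data segments often contain long runs of zero bytes. Base64 then makes the result printable at the cost of roughly a one-third size increase.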

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a coredump server, a significant new feature. The implementation is extensive, with support for multiple architectures, and the code is generally well-structured. I have identified a few critical and high-severity issues related to resource management, robustness, and correctness that should be addressed before merging. Specifically, there's a critical bug in main() where a port is destroyed before it's created, and a few cases where the server could crash or exhibit undefined behavior due to improper error handling or resource management. I've also included several medium-severity suggestions to improve maintainability, portability, and code clarity. Overall, this is a solid foundation for the coredump functionality.
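
The review does not quote the offending code, so the following is only a generic sketch of the cleanup ordering that avoids destroying a resource before it exists. The port type and the `demo_` functions are hypothetical stand-ins that merely record state; they are not the real port API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for a port API; they only track whether the port exists. */
typedef struct {
	bool created;
} demo_port_t;

static int demo_port_create(demo_port_t *p)
{
	p->created = true;
	return 0;
}

static void demo_port_destroy(demo_port_t *p)
{
	assert(p->created); /* destroying a port that was never created is the bug to avoid */
	p->created = false;
}

/* `fail_setup` simulates an error before the port exists, `fail_run` an error after it. */
int demo_server_main(bool fail_setup, bool fail_run)
{
	demo_port_t port = { .created = false };
	int ret = -1;

	if (fail_setup) {
		goto out; /* nothing acquired yet, so nothing to release */
	}
	if (demo_port_create(&port) < 0) {
		goto out;
	}
	if (fail_run) {
		goto out_port;
	}
	ret = 0;

out_port:
	demo_port_destroy(&port);
out:
	return ret;
}
```

Each error path jumps to the label that releases only what has been acquired so far, so an early failure never reaches `demo_port_destroy`.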

@etiaro etiaro force-pushed the etiaro/coredump-server branch 3 times, most recently from d350479 to 8851c58 Compare August 12, 2025 15:20
@etiaro etiaro force-pushed the etiaro/coredump-server branch from 8851c58 to c0e8124 Compare August 12, 2025 16:05
@etiaro etiaro force-pushed the etiaro/coredump-server branch 2 times, most recently from 28d0118 to e92b012 Compare August 13, 2025 13:03
@etiaro etiaro requested review from Darchiv and agkaminski August 13, 2025 13:06
@etiaro etiaro marked this pull request as ready for review August 13, 2025 13:06
@etiaro etiaro force-pushed the etiaro/coredump-server branch 2 times, most recently from 930f45a to 7abf484 Compare August 28, 2025 14:44
Introduces coredump_server executable and psh applet for coredump configuration.

JIRA: RTOS-1054
@etiaro etiaro force-pushed the etiaro/coredump-server branch from 7abf484 to d6776cf Compare August 29, 2025 10:43