Skip to content

Conversation

@etiaro
Copy link
Contributor

@etiaro etiaro commented Aug 11, 2025

Description

Introduces kernel thread that uses messaging to inform coredump server process about crashed processes and provide any data required for creating coredump for post-mortem analysis with GDB.

Motivation and Context

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Chore (refactoring, style fixes, git/CI config, submodule management, no code logic changes)

How Has This Been Tested?

  • Already covered by automatic testing.
  • New test added: (add PR link here).
  • Tested by hand on: ia32-generic-qemu, armv7r5f-zynqmp-qemu, sparcv8leon-generic-qemu

Checklist:

  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have added tests to cover my changes.
  • All new and existing linter checks and tests passed.
  • My changes generate no new compilation warnings for any of the targets.

Special treatment

@etiaro etiaro changed the title Etiaro/coredump server coredump: coredump server support Aug 11, 2025
Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary of Changes

Hello @etiaro, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a coredump server functionality to the Phoenix-RTOS kernel. Its primary purpose is to capture the state of a crashed process, including its memory segments, threads, and context, to facilitate post-mortem debugging. This is achieved by integrating the coredump mechanism into the kernel's exception and process management systems, ensuring that fatal errors trigger a coredump rather than just halting the system.

Highlights

  • Coredump Server Implementation: A new coredump server is added, capable of enqueuing crashed processes and providing their memory and thread information upon request.
  • Enhanced Exception Handling: The kernel's exception handlers across various architectures are updated to initiate a coredump sequence instead of simply halting the CPU on fatal exceptions.
  • Process and Thread Management Integration: Core process and thread management routines are modified to support the coredump lifecycle, including marking processes for coredump, handling thread termination during coredump, and managing memory deallocation.
  • Build System Updates: The build system is updated to compile and link the new coredump server components into the kernel.
  • Context and Relocation Handling: Improvements are made to how CPU contexts are saved and restored, and relocation information for processes (especially for NOMMU builds) is now accessible for coredump purposes.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in issue comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces coredump server support, a significant feature involving extensive changes across exception handling, process management, and thread lifecycle. The core logic is implemented in the new coredump module. While the overall approach is sound, I've identified several critical issues, including potential NULL pointer dereferences and a memory leak in proc_join. I've also provided suggestions to improve robustness and fix bugs in the coredump data handling. Please address these points to ensure the stability of this new functionality.

@etiaro etiaro force-pushed the etiaro/coredump-server branch 3 times, most recently from 5660cd1 to 194e1b9 Compare August 12, 2025 15:19
@etiaro etiaro force-pushed the etiaro/coredump-server branch 4 times, most recently from 1ec9788 to 4634bcc Compare August 13, 2025 12:53
@etiaro etiaro marked this pull request as ready for review August 13, 2025 12:53
@etiaro etiaro force-pushed the etiaro/kernel-sighandlers branch from f7f2833 to 1089083 Compare August 18, 2025 13:04
@etiaro etiaro force-pushed the etiaro/coredump-server branch 4 times, most recently from 448164a to 29d3166 Compare August 22, 2025 12:45
@etiaro etiaro force-pushed the etiaro/kernel-sighandlers branch from 1089083 to 4b21ec0 Compare August 22, 2025 12:45
@etiaro etiaro force-pushed the etiaro/kernel-sighandlers branch from 4b21ec0 to df5951b Compare October 17, 2025 10:25
@etiaro etiaro force-pushed the etiaro/coredump-server branch from 29d3166 to 301c469 Compare October 20, 2025 16:01
Comment on lines 36 to 52
thread_t *sender;
spinlock_ctx_t sc;

p = proc_portGet(port);
if (p == NULL) {
return -EINVAL;
}

sender = proc_current();

kmsg.msg = msg;
kmsg.src = sender->process;
kmsg.src = senderProc;
kmsg.threads = NULL;
kmsg.state = msg_waiting;

kmsg.msg->pid = (sender->process != NULL) ? process_getPid(sender->process) : 0;
kmsg.msg->priority = sender->priority;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't this also be just senderProc instead of sender->proc? It seems that calling it for different process might give the msg wrong pid and priority

Copy link
Contributor Author

@etiaro etiaro Nov 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that the naming is really confusing here.
Primary objective of this modification was to send data from senderProc's address space (for MMU targets, here it doesn't have much effect).
The messaging itself is handled by sender (proc_current()), so in my opinion pid and priority should remain as it is.

That is probably not a great design choice, I'll try to change this function parameter to source map instead of process to show more directly what it does.

Comment on lines 346 to 366
thread_t *sender;
spinlock_ctx_t sc;

/* TODO - check if msg pointer belongs to user vm_map */
if (msg == NULL) {
return -EINVAL;
}

p = proc_portGet(port);
if (p == NULL) {
return -EINVAL;
}

sender = proc_current();

hal_memcpy(&kmsg.msg, msg, sizeof(msg_t));
kmsg.src = sender->process;
kmsg.src = senderProc;
kmsg.threads = NULL;
kmsg.state = msg_waiting;

kmsg.msg.pid = (sender->process != NULL) ? process_getPid(sender->process) : 0;
kmsg.msg.priority = sender->priority;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here about senderProc and sender->proc

@etiaro etiaro force-pushed the etiaro/coredump-server branch from 301c469 to 14f63ce Compare November 4, 2025 16:49
@etiaro etiaro force-pushed the etiaro/coredump-server branch from 14f63ce to 7f22143 Compare November 4, 2025 16:57
* add proc_sendFromMap to send messages with data from given map
* introduce include/coredump.h with coredump server messages structures
* introduce coredump thread with queue of crashed threads to dump
* introduce proc_crash function that sets up thread for coredumping and kills the process
* delay kstack kfree to ghost removal so that thread context is available during coredumping
* trigger coredump on termination by standard set of signals

JIRA: RTOS-1054
@etiaro etiaro force-pushed the etiaro/coredump-server branch from 7f22143 to 28cea5a Compare November 4, 2025 17:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants