-
Notifications
You must be signed in to change notification settings - Fork 45
!perf: introduce performance analysis subsystem #661
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello @adamgreloch, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
Summary of Changes
Hello team, gemini-code-assist here to provide a summary of this pull request. This PR introduces a new performance analysis subsystem to the kernel. The primary goal is to provide a framework for various performance monitoring mechanisms, starting with detailed kernel event tracing. A significant change is the generalization of the existing perf_* syscalls to support different modes of operation, including the previously existing thread-based performance monitoring and the new tracing mode. The tracing mode captures events like syscall entries/exits, interrupt handling, lock operations, and scheduler activity, writing them in CTF format to an in-memory circular buffer that can be read from userspace.
Highlights
- New Performance Subsystem: Introduces a dedicated subsystem (
perf/) for performance analysis tools. - Generalized
perf_*Syscalls: The existingperf_start,perf_read, andperf_finishsyscalls are modified to accept aperf_mode_targument, allowing them to control different performance monitoring modes. The previous functionality is preserved underperf_mode_threads. - Kernel Event Tracing (
perf_mode_trace): Adds a new tracing mode that captures various kernel events (syscalls, interrupts, locks, scheduler) and outputs them in Common Trace Format (CTF). - CTF Output and In-Memory Buffer: Trace events are written to a 4MB in-memory circular buffer using a new buffer implementation (
perf/buffer-mem.c). The data format conforms to the CTF metadata defined inperf/tsdl/metadata. - Instrumentation: Instrumentation points are added in key kernel areas (interrupt dispatcher, scheduler, thread creation/termination, lock operations, syscall dispatcher) to record tracing events.
Changelog
Click here to see the changelog
- Makefile
- Includes the new
perf/Makefileto integrate the performance subsystem into the build.
- Includes the new
- hal/ia32/interrupts.c
- Includes
perf/events.h. - Adds
trace_irqsflag to control interrupt tracing. - Adds calls to
perf_traceEventsInterruptEnter(line 201) andperf_traceEventsInterruptExit(line 217) withininterrupts_dispatchIRQwhen tracing is enabled. - Adds
_hal_interruptsTracefunction (line 456) to enable/disable interrupt tracing. - Initializes
trace_irqsto 0 in_hal_interruptsInit(line 467).
- Includes
- hal/interrupts.h
- Declares the new
_hal_interruptsTracefunction (line 35).
- Declares the new
- include/perf.h
- New file defining the
perf_mode_tenum (line 20) for different performance modes.
- New file defining the
- main.c
- Includes
perf/perf.h. - Calls
_perf_init(line 112) during kernel initialization.
- Includes
- perf/Makefile
- New file defining the object files for the performance subsystem (
perf.o,trace.o,buffer-mem.o).
- New file defining the object files for the performance subsystem (
- perf/buffer-mem.c
- New file implementing an in-memory circular buffer for trace data.
- Includes functions for buffer initialization (
trace_bufferInit, line 131), starting (_trace_bufferStart, line 81), finishing (_trace_bufferFinish, line 109), reading (_trace_bufferRead, line 97), and writing (_trace_bufferWrite, line 103). - Uses kernel VM functions (
vm_mapFind,vm_pageAlloc,page_map,vm_munmap,vm_pageFree) to manage the buffer memory.
- perf/buffer.h
- New file declaring the interface for the trace event buffer.
- perf/events.h
- New file defining CTF event IDs (line 23) for various kernel events.
- Provides inline helper functions (e.g.,
perf_traceEventsLockSet,perf_traceEventsInterruptEnter,perf_traceEventsThreadCreate,perf_traceEventsSyscallEnter,perf_traceEventsSchedEnter) to write specific event data to the trace buffer usingperf_traceEventsWrite.
- perf/perf.c
- New file implementing the main performance subsystem interface.
- Includes
trace.handperf.h. - Implements
_perf_init(line 21),perf_start(line 27),perf_read(line 40), andperf_finish(line 53), dispatching calls based on theperf_mode_targument.
- perf/perf.h
- New file declaring the main performance subsystem interface functions.
- perf/trace.c
- New file implementing the CTF tracing backend.
- Includes
buffer.h,events.h, andtrace.h. - Implements
_perf_traceInit(line 152),perf_traceStart(line 82),perf_traceRead(line 108),perf_traceFinish(line 125), andperf_traceIsRunning(line 76). - Handles writing events with timestamps (
perf_traceEventsWrite, line 53) using the trace buffer. - Includes logic for detecting timer non-monotonicity and buffer overflows.
- perf/trace.h
- New file declaring the interface for the trace backend.
- perf/tsdl/metadata
- New file defining the CTF metadata for the kernel trace stream.
- Defines data types, clock information, stream structure, and the fields for each traced event type (lock, interrupt, thread, syscall, scheduler).
- proc/process.c
- Includes
perf/events.h. - Adds
process_getNamefunction (line 1730) to retrieve a process's name from its path or arguments.
- Includes
- proc/process.h
- Declares the new
process_getNamefunction (line 82).
- Declares the new
- proc/threads.c
- Includes
perf/events.h. - Adds calls to thread-related trace event functions (
perf_traceEventsThreadScheduling, etc.) in the_perf_eventhelpers (lines 148, 155, 162, 169). - Renames the old
perf_start,perf_read,perf_finishfunctions toperf_threadsStart(line 339),perf_threadsRead(line 368), andperf_threadsFinish(line 380) respectively. - Adds calls to
perf_traceEventsSchedEnter(line 680) andperf_traceEventsSchedExit(line 686) inthreads_schedule. - Adds calls to
perf_traceEventsThreadCreate(line 839) inproc_threadCreateandperf_traceEventsThreadEnd(line 981) inproc_threadEnd. - Adds calls to
perf_traceEventsLockSet(line 1584) in_proc_lockSetandperf_traceEventsLockClear(line 1691) in_proc_lockUnlock. - Updates
proc_threadsList(line 1937) to use the newprocess_getNamefunction (line 1982).
- Includes
- proc/threads.h
- Renames the declarations for the old
perf_*functions toperf_threadsStart(line 97),perf_threadsRead(line 100), andperf_threadsFinish(line 103).
- Renames the declarations for the old
- syscalls.c
- Includes new performance-related headers (
include/perf.h,perf/perf.h,perf/trace.h,perf/events.h). - Modifies
syscalls_perf_start(line 489),syscalls_perf_read(line 503), andsyscalls_perf_finish(line 520) to accept theperf_mode_targument. - Adds calls to
perf_traceEventsSyscallEnter(line 1915) andperf_traceEventsSyscallExit(line 1919) insyscalls_dispatch.
- Includes new performance-related headers (
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.
| Feature | Command | Description |
|---|---|---|
| Code Review | /gemini review |
Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary |
Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | /gemini help |
Displays a list of available commands. |
Customization
To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
A kernel's heart, a hidden beat,
Now traced in time, a data treat.
From syscall's call to IRQ's flight,
Performance shines in new-found light.
Footnotes
-
Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a comprehensive performance analysis subsystem, which is a significant and valuable addition. The generalization of existing perf_* syscalls and the implementation of CTF-based event tracing are well-executed. The code is generally clear, well-structured, and adheres to common C kernel development practices.
I've noted a couple of areas for improvement below, primarily concerning code maintainability and a performance consideration that aligns with an existing TODO. Given the scope and impact of these changes, especially the breaking change to syscalls, thorough testing is crucial. While manual testing is noted, considering automated tests for this new subsystem would be beneficial for long-term stability and maintenance.
Summary of Findings
- Code Duplication: The
_trace_bufferFinishfunction inperf/buffer-mem.cduplicates logic fromtrace_bufferFreefor deallocating buffer memory. This could be refactored to calltrace_bufferFree. - Performance of Lock Event Tracing: Copying lock names on every lock/unlock event in
perf/events.hcan be inefficient. An existing TODO comment rightly points this out, suggesting a more optimized approach. - Unused struct members (Low Severity - Not directly commented): The
buffer_commonstruct inperf/buffer-mem.ccontainslastTimestampandprevfields (lines 27, 31) that are not used within that file. Similarly,trace_common.lastTimestampinperf/trace.c(line 21) appears unused. These might be leftovers and could be removed if truly unnecessary to avoid confusion.
Merge Readiness
This pull request introduces a significant and valuable performance analysis subsystem. The changes are generally well-implemented. However, there are a couple of medium-severity issues related to code duplication and a performance consideration (already noted by a TODO) that should ideally be addressed before merging to improve maintainability and efficiency.
Given the breaking nature of the syscall changes and the introduction of a new subsystem, ensuring comprehensive testing (including consideration for automated tests) is highly recommended.
I am unable to approve pull requests directly. Please ensure these points are considered and further review is conducted if necessary before merging.
88049ce to
941f35d
Compare
7ea33bb to
04541a2
Compare
d589296 to
bf8b770
Compare
ffe0245 to
74d7e22
Compare
74d7e22 to
e313a33
Compare
831a2a4 to
a784c80
Compare
ee43a1c to
a612849
Compare
ed4c373 to
40eaa97
Compare
9331c58 to
8fcb7fe
Compare
a8729ae to
7099f23
Compare
e1ab161 to
5c11e27
Compare
5c11e27 to
7d212f3
Compare
etiaro
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few minor suggestions, apart from that looks good.
7d212f3 to
cf7daa1
Compare
cf7daa1 to
52dd115
Compare
52dd115 to
5273440
Compare
…ction JIRA: RTOS-1057
Adds a constant memory iterator over running threads that accepts a custom threadinfo callback. It is a generalization of proc_threadsList that makes it possible, e.g., to write the threadinfo struct directly to RTT without allocating memory for an array of threadinfos (as would be the case if using proc_threadsList). JIRA: RTOS-1057
Introduces a subsystem intended for various performance analysis mechanisms: tracing, sample-based methods, PMU utilization, etc. Breaking changes: * This commit generalizes the `perf_*` syscalls to support various perf modes * Previous perf semantics are preserved in the `perf_mode_threads` mode * The return value of `perf_start` on success changes from 0 to a non-negative number denoting count of channels that shouold be read by `perf_read`. * `perf_read` accepts additional `chan` argument denoting an ID of a channel to be read (a mode could serve multiple read-only channels) The subsystem currently implements `perf_mode_trace` mode - a mechanism for tracing syscalls, interrupts, locks and scheduler events. The events are outputed in CTF format to a set of cbuffers that can be retrieved from userspace via `perf_read(perf_mode_trace, ..., chan)` and later processed by 3rd party libraries like babeltrace2. This commit also adds a `perf_stop` syscall fo stopping the perf action without freeing the internal buffers, to allow reading them post-factum and not disturb the trace with `perf_read` calls while it is being recorded. The trace can be gathered in start-stop fashion or in a rolling-window fashion, if `perf_start` is supplied with `PERF_TRACE_FLAG_ROLLING` flag. In the rolling-window fashion, the trace buffer write ops will discard the oldest events if the buffers are full. JIRA: RTOS-1057
(Old) threads perf mode is redundant as its functionality is realized by trace mode JIRA: RTOS-1057
JIRA: RTOS-1057
Introduces a subsystem intended for various performance analysis
mechanisms: tracing, sample-based methods, PMU utilization, etc.
Description
Breaking changes:
perf_*syscalls to support various perfmodes
perf_starton success changes from 0 to anon-negative number denoting count of channels that shouold be read by
perf_read.perf_readaccepts additionalchanargument denoting an ID of achannel to be read (a mode could serve multiple read-only channels)
The subsystem currently implements
perf_mode_tracemode - a mechanismfor tracing syscalls, interrupts, locks and scheduler events. The events
are outputed in CTF format to a set of cbuffers that can be retrieved
from userspace via
perf_read(perf_mode_trace, ..., chan)and laterprocessed by 3rd party libraries like babeltrace2.
This commit also adds a
perf_stopsyscall fo stopping the perf actionwithout freeing the internal buffers, to allow reading them post-factum
and not disturb the trace with
perf_readcalls while it is beingrecorded.
The trace can be gathered in start-stop fashion or in a rolling-window
fashion, if
perf_startis supplied withPERF_TRACE_FLAG_ROLLINGflag.In the rolling-window fashion, the trace buffer write ops will discard
the oldest events if the buffers are full.
Motivation and Context
Types of changes
How Has This Been Tested?
Checklist:
Special treatment
!perf: generalize perf syscalls libphoenix#415