Move timing stats from logs to status response#49
Conversation
- Remove print_stats_thread that was polluting logs every 5 seconds - Add STATS command to unix socket for fetching timing statistics - Add thread-safe mutex protection for plc_timing_stats access - Add get_timing_stats_snapshot() for safe concurrent reads - Add format_timing_stats_response() to format stats as JSON - Add stats_plc() method to RuntimeManager - Modify handle_status() to include timing_stats in response - Maintain backward compatibility: status field unchanged for old editors - Fix missing return statement in stop_plc exception handler - Disable R0902 and R1732 pylint warnings in pre-commit config Co-Authored-By: Thiago Alves <thiagoralves@gmail.com>
🤖 Devin AI EngineerI'll be helping with this pull request! Here's what you should know: ✅ I will automatically:
Note: I can only respond to comments from users who have write access to this repository. ⚙️ Control Options:
|
Use PRId64 from inttypes.h instead of %ld for portable int64_t formatting. This fixes build failures on 32-bit ARM platforms where int64_t is 'long long int' rather than 'long int'. Co-Authored-By: Thiago Alves <thiagoralves@gmail.com>
There was a problem hiding this comment.
Pull request overview
This PR removes periodic stats logging that occurred every 5 seconds and exposes timing statistics through the /api/status endpoint instead. The changes maintain backward compatibility by keeping the existing status field while adding a new timing_stats object.
Key changes:
- Removed the
print_stats_threadthat logged statistics every 5 seconds - Added thread-safe access to timing statistics using mutex locks for safe concurrent access
- Introduced a new
STATScommand in the unix socket interface that returns timing data as JSON
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| webserver/runtimemanager.py | Added stats_plc() method to fetch timing stats via STATS command; fixed missing return in stop_plc() exception handler |
| webserver/app.py | Modified status endpoint to include timing_stats; added parse_timing_stats() helper function |
| core/src/plc_app/unix_socket.c | Added STATS command handler |
| core/src/plc_app/scan_cycle_manager.h | Added function declarations for thread-safe stats access and formatting |
| core/src/plc_app/scan_cycle_manager.c | Implemented mutex-protected stats access; added formatting function using PRId64 macros |
| core/src/plc_app/plc_main.c | Removed print_stats_thread function and associated thread creation/cleanup |
| .pre-commit-config.yaml | Added R0902 and R1732 to disabled pylint warnings |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| } | ||
|
|
||
|
|
||
| def parse_timing_stats(stats_response: Optional[str]) -> Optional[dict]: |
There was a problem hiding this comment.
The function name parse_timing_stats could be more specific. Consider renaming to parse_stats_response to better reflect that it parses the STATS command response format, not just any timing stats.
| return snprintf(buffer, buffer_size, | ||
| "STATS:{" | ||
| "\"scan_count\":0," | ||
| "\"scan_time_min\":null," | ||
| "\"scan_time_max\":null," | ||
| "\"scan_time_avg\":null," | ||
| "\"cycle_time_min\":null," | ||
| "\"cycle_time_max\":null," | ||
| "\"cycle_time_avg\":null," | ||
| "\"cycle_latency_min\":null," | ||
| "\"cycle_latency_max\":null," | ||
| "\"cycle_latency_avg\":null," | ||
| "\"overruns\":0" | ||
| "}\n"); | ||
| } | ||
|
|
||
| return snprintf(buffer, buffer_size, | ||
| "STATS:{" | ||
| "\"scan_count\":%" PRId64 "," | ||
| "\"scan_time_min\":%" PRId64 "," | ||
| "\"scan_time_max\":%" PRId64 "," | ||
| "\"scan_time_avg\":%" PRId64 "," | ||
| "\"cycle_time_min\":%" PRId64 "," | ||
| "\"cycle_time_max\":%" PRId64 "," | ||
| "\"cycle_time_avg\":%" PRId64 "," | ||
| "\"cycle_latency_min\":%" PRId64 "," | ||
| "\"cycle_latency_max\":%" PRId64 "," | ||
| "\"cycle_latency_avg\":%" PRId64 "," | ||
| "\"overruns\":%" PRId64 "}\n", | ||
| snapshot.scan_count, snapshot.scan_time_min, snapshot.scan_time_max, | ||
| snapshot.scan_time_avg, snapshot.cycle_time_min, snapshot.cycle_time_max, | ||
| snapshot.cycle_time_avg, snapshot.cycle_latency_min, snapshot.cycle_latency_max, | ||
| snapshot.cycle_latency_avg, snapshot.overruns); |
There was a problem hiding this comment.
The snprintf return value should be checked to ensure the formatted string wasn't truncated. If the return value >= buffer_size, the output was truncated and may result in malformed JSON being sent to clients.
| return snprintf(buffer, buffer_size, | |
| "STATS:{" | |
| "\"scan_count\":0," | |
| "\"scan_time_min\":null," | |
| "\"scan_time_max\":null," | |
| "\"scan_time_avg\":null," | |
| "\"cycle_time_min\":null," | |
| "\"cycle_time_max\":null," | |
| "\"cycle_time_avg\":null," | |
| "\"cycle_latency_min\":null," | |
| "\"cycle_latency_max\":null," | |
| "\"cycle_latency_avg\":null," | |
| "\"overruns\":0" | |
| "}\n"); | |
| } | |
| return snprintf(buffer, buffer_size, | |
| "STATS:{" | |
| "\"scan_count\":%" PRId64 "," | |
| "\"scan_time_min\":%" PRId64 "," | |
| "\"scan_time_max\":%" PRId64 "," | |
| "\"scan_time_avg\":%" PRId64 "," | |
| "\"cycle_time_min\":%" PRId64 "," | |
| "\"cycle_time_max\":%" PRId64 "," | |
| "\"cycle_time_avg\":%" PRId64 "," | |
| "\"cycle_latency_min\":%" PRId64 "," | |
| "\"cycle_latency_max\":%" PRId64 "," | |
| "\"cycle_latency_avg\":%" PRId64 "," | |
| "\"overruns\":%" PRId64 "}\n", | |
| snapshot.scan_count, snapshot.scan_time_min, snapshot.scan_time_max, | |
| snapshot.scan_time_avg, snapshot.cycle_time_min, snapshot.cycle_time_max, | |
| snapshot.cycle_time_avg, snapshot.cycle_latency_min, snapshot.cycle_latency_max, | |
| snapshot.cycle_latency_avg, snapshot.overruns); | |
| int written = snprintf(buffer, buffer_size, | |
| "STATS:{" | |
| "\"scan_count\":0," | |
| "\"scan_time_min\":null," | |
| "\"scan_time_max\":null," | |
| "\"scan_time_avg\":null," | |
| "\"cycle_time_min\":null," | |
| "\"cycle_time_max\":null," | |
| "\"cycle_time_avg\":null," | |
| "\"cycle_latency_min\":null," | |
| "\"cycle_latency_max\":null," | |
| "\"cycle_latency_avg\":null," | |
| "\"overruns\":0" | |
| "}\n"); | |
| if (written < 0) | |
| { | |
| return written; | |
| } | |
| if ((size_t)written >= buffer_size) | |
| { | |
| /* Output was truncated */ | |
| return -1; | |
| } | |
| return written; | |
| } | |
| int written = snprintf(buffer, buffer_size, | |
| "STATS:{" | |
| "\"scan_count\":%" PRId64 "," | |
| "\"scan_time_min\":%" PRId64 "," | |
| "\"scan_time_max\":%" PRId64 "," | |
| "\"scan_time_avg\":%" PRId64 "," | |
| "\"cycle_time_min\":%" PRId64 "," | |
| "\"cycle_time_max\":%" PRId64 "," | |
| "\"cycle_time_avg\":%" PRId64 "," | |
| "\"cycle_latency_min\":%" PRId64 "," | |
| "\"cycle_latency_max\":%" PRId64 "," | |
| "\"cycle_latency_avg\":%" PRId64 "," | |
| "\"overruns\":%" PRId64 "}\n", | |
| snapshot.scan_count, snapshot.scan_time_min, snapshot.scan_time_max, | |
| snapshot.scan_time_avg, snapshot.cycle_time_min, snapshot.cycle_time_max, | |
| snapshot.cycle_time_avg, snapshot.cycle_latency_min, snapshot.cycle_latency_max, | |
| snapshot.cycle_latency_avg, snapshot.overruns); | |
| if (written < 0) | |
| { | |
| return written; | |
| } | |
| if ((size_t)written >= buffer_size) | |
| { | |
| /* Output was truncated */ | |
| return -1; | |
| } | |
| return written; |
| bool get_timing_stats_snapshot(plc_timing_stats_t *snapshot) | ||
| { | ||
| if (snapshot == NULL) | ||
| { | ||
| return false; | ||
| } |
There was a problem hiding this comment.
The function returns false when snapshot is NULL, but this condition is never checked by the caller in format_timing_stats_response at line 126. Consider documenting that NULL should never be passed, or add a check in the caller.
Move timing stats from logs to status response
Summary
This PR removes the periodic stats printing thread that was polluting logs every 5 seconds and instead makes timing statistics available through the
/api/statusendpoint. The change maintains backward compatibility by keeping the existingstatusfield unchanged while adding a newtiming_statsobject.Key changes:
print_stats_threadfromplc_main.cthat logged stats every 5 secondsplc_timing_statsaccess (critical for 32-bit ARM platforms where 64-bit reads can tear)STATSunix socket command that returns timing stats as JSON/api/statusresponse to includetiming_statsalongside the existingstatusfieldstop_plc()exception handlerPRId64format specifiers from<inttypes.h>for cross-platform int64_t formattingSample response from
/api/status:{ "status": "STATUS:RUNNING", "timing_stats": { "scan_count": 12345, "scan_time_min": 10, "scan_time_max": 150, "scan_time_avg": 45, "cycle_time_min": 1000, "cycle_time_max": 1200, "cycle_time_avg": 1050, "cycle_latency_min": -5, "cycle_latency_max": 200, "cycle_latency_avg": 50, "overruns": 3 } }When no scan cycles have run yet, min/max/avg values will be
null.Updates since last revision
%ldformat specifiers with portablePRId64macro from<inttypes.h>for int64_t typesReview & Testing Checklist for Human
scan_cycle_manager.cdoesn't introduce latency issues in the real-time PLC cycle thread. The mutex is held during stats updates which should be fast, but worth verifying on actual hardware.body.statusstill work correctly (they should ignore the newtiming_statsfield)timing_statsJSON structure matches what the editor team expects to consumeRecommended test plan:
/api/statusand verify timing stats are populatedNotes
/api/statusendpoint returns the expected response withtiming_stats.pre-commit-config.yamlchange adds R0902 (too-many-instance-attributes) and R1732 (consider-using-with) to disabled pylint warnings - these are pre-existing issues in the codebase, not introduced by this PRLink to Devin run: https://app.devin.ai/sessions/eb490f1d8d8c43ef9c9619fd12c3eed0
Requested by: Thiago Alves (thiago.alves@autonomylogic.com) / @thiagoralves