| 1 | +# Changelog |
| 2 | + |
| 3 | +All notable changes to the IPC Benchmark Suite will be documented in this file. |
| 4 | + |
| 5 | +The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/), |
| 6 | +and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html). |
| 7 | + |
| 8 | +## [Unreleased] |
| 9 | + |
| 10 | +### Added - PR #93: Blocking Mode & Direct Memory SHM |
| 11 | + |
| 12 | +#### Major Features |
| 13 | +- **Blocking execution mode** - Complete synchronous I/O implementation for all IPC mechanisms |
| 14 | + - Uses pure `std` library (no Tokio in blocking mode) |
| 15 | +  - Added `--blocking` flag to enable blocking mode |
| 16 | + - Maintains backward compatibility (async is default) |
| 17 | + - Both modes coexist in same binary |
| 18 | + |
| 19 | +- **Direct memory shared memory implementation** - High-performance C-style SHM |
| 20 | +  - Added `--shm-direct` flag to select direct memory SHM (requires `--blocking`) |
| 21 | + - Zero serialization overhead (direct memcpy with `#[repr(C)]`) |
| 22 | + - pthread mutex + condition variable synchronization |
| 23 | +  - 3× lower mean latency than ring buffer SHM (7.42 μs vs 22.51 μs) |
| 24 | +  - 450× lower max latency than ring buffer SHM (22.18 μs vs 9,983 μs) |
| 25 | + - Matches C benchmark performance |
| 26 | + |
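The zero-serialization copy described for `--shm-direct` can be sketched as follows. This is a minimal illustration under stated assumptions, not the suite's implementation: the `Slot` struct, its fields, and the `write_slot`/`read_slot` helpers are hypothetical, a plain byte buffer stands in for the mapped shared-memory region, and the pthread mutex and condition variable synchronization is left out.

```rust
use std::mem::size_of;
use std::ptr;

/// Hypothetical fixed-layout message slot. `#[repr(C)]` pins field order and
/// offsets so writer and reader agree on the byte layout. The fields are
/// chosen so the struct is exactly 64 bytes with no padding.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Slot {
    pub id: u64,
    pub timestamp_ns: u64,
    pub len: u64,
    pub payload: [u8; 40],
}

/// Copy `slot` byte-for-byte into the start of `region` (no serialization step).
pub fn write_slot(region: &mut [u8], slot: &Slot) {
    assert!(region.len() >= size_of::<Slot>());
    // SAFETY: the region is large enough (checked above) and
    // `write_unaligned` places no alignment requirement on the destination.
    unsafe { ptr::write_unaligned(region.as_mut_ptr().cast::<Slot>(), *slot) }
}

/// Read a `Slot` back out of the start of `region`.
pub fn read_slot(region: &[u8]) -> Slot {
    assert!(region.len() >= size_of::<Slot>());
    // SAFETY: the region is large enough, and every bit pattern is a valid
    // `Slot` because it contains only integers and bytes.
    unsafe { ptr::read_unaligned(region.as_ptr().cast::<Slot>()) }
}
```

In a real cross-process setup the region would come from a shared mapping, and access would be guarded by the process-shared mutex and condition variable mentioned above.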
| 27 | +#### Blocking Transport Implementations |
| 28 | +- `BlockingUnixDomainSocket` - Blocking UDS transport |
| 29 | +- `BlockingTcpSocket` - Blocking TCP transport |
| 30 | +- `BlockingSharedMemory` - Blocking ring buffer SHM transport |
| 31 | +- `BlockingSharedMemoryDirect` - Direct memory SHM transport |
| 32 | +- `BlockingPosixMessageQueue` - Blocking PMQ transport (Linux only) |
| 33 | + |
| 34 | +#### Infrastructure |
| 35 | +- `BlockingTransport` trait - Common interface for blocking transports |
| 36 | +- `BlockingTransportFactory` - Factory for creating blocking transports |
| 37 | +- `BlockingBenchmarkRunner` - Benchmark orchestration for blocking mode |
| 38 | +- `BlockingResultsManager` - Results collection for blocking mode |
| 39 | +- `ExecutionMode` enum - Runtime mode selection |
| 40 | + |
| 41 | +#### Documentation |
| 42 | +- Added blocking mode section to README.md |
| 43 | +- New METHODOLOGY_CHANGE.md - Timestamp methodology details |
| 44 | +- New SHM_ANALYSIS.md - Performance analysis |
| 45 | +- New TEST_REPORT.md - Test results |
| 46 | +- New AGENTS.md - AI agent guidelines |
| 47 | +- New examples/blocking_basic.rs - Basic usage example |
| 48 | +- New examples/blocking_comparison.rs - Async vs blocking comparison |
| 49 | + |
| 50 | +#### Testing |
| 51 | +- 40 passing doctests |
| 52 | +- 24+ integration tests for blocking mode |
| 53 | +- Full coverage of all blocking transports |
| 54 | +- Advanced scenario testing |
| 55 | + |
| 56 | +### Fixed - PR #93 |
| 57 | + |
| 58 | +- **Critical: Warmup reporting bug** - JSON output now correctly reports warmup iterations |
| 59 | + - `warmup_iterations` was hardcoded to 0, now uses actual value |
| 60 | + |
| 61 | +- **Critical: Cold-start penalty** - Eliminated 308× first-message latency spike |
| 62 | + - Implemented canary message system (ID = u64::MAX) |
| 63 | + - First message excluded from results by default |
| 64 | + - Max latency improved from 9.98 ms to 32 μs for SHM |
| 65 | +  - Added `--include-first-message` flag to keep the first message in the results when needed |
| 66 | + |
| 67 | +- **Critical: Doctest compilation failures** - Fixed 3 doctests with outdated signatures |
| 68 | + - Updated `BlockingTransportFactory::create()` examples |
| 69 | + - Added missing Args struct fields in benchmark.rs |
| 70 | + |
| 71 | +- **MessageType serialization** - Implemented `From<u32>` for `MessageType` so values deserialize correctly |
| 72 | + |
| 73 | +- **Flag validation** - `--shm-direct` now requires `--blocking` with clear error message |
| 74 | + |
| 75 | +- **Trailing whitespace** - Fixed formatting issues in benchmark.rs and benchmark_blocking.rs |
| 76 | + |
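The cold-start fix listed above excludes the first (canary) message from the reported results. A minimal sketch of that filtering step follows. Only the reserved ID `u64::MAX` and the opt-in behaviour of `--include-first-message` come from the entries above; the `Sample` type, the `recorded_latencies` helper, and the assumption that the canary is itself the first message are illustrative.

```rust
/// Reserved ID for the canary message (the changelog gives ID = u64::MAX).
pub const CANARY_ID: u64 = u64::MAX;

/// Hypothetical record of one measured message.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Sample {
    pub id: u64,
    pub latency_ns: u64,
}

/// Return the latencies that should enter the statistics.
/// The canary is dropped unless `include_first_message` is set.
pub fn recorded_latencies(samples: &[Sample], include_first_message: bool) -> Vec<u64> {
    samples
        .iter()
        .filter(|s| include_first_message || s.id != CANARY_ID)
        .map(|s| s.latency_ns)
        .collect()
}
```

With a canary that took 9.98 ms and later messages in the tens of microseconds, dropping the canary keeps the reported maximum from being dominated by one-time setup cost.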
| 77 | +### Changed - PR #93 |
| 78 | + |
| 79 | +- **Timestamp methodology** - Now matches C benchmark approach |
| 80 | + - Timestamps captured immediately before IPC syscalls |
| 81 | +  - Messages are pre-serialized so serialization time is excluded from measured latency |
| 82 | + - Ensures scheduling delays are included in measured latency |
| 83 | + |
| 84 | +- **Performance** - Rust is now on par with or faster than the C benchmark in the measured tests |
| 85 | +  - PMQ Avg: 1.02× C (essentially equal) |
| 86 | +  - SHM Min: 0.41× C (59% lower latency) |
| 87 | +  - UDS Avg: 0.79× C (21% lower latency) |
| 88 | +  - UDS Max: 0.63× C (37% lower latency) |
| 89 | + |
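The timestamp methodology change above (serialize first, stamp immediately before the IPC syscall) can be sketched like this. It is an illustration under assumptions, not the suite's code: the frame layout (8-byte little-endian timestamp followed by the payload), the `send_and_measure` helper, and the use of a single-process `UnixStream::pair()` are all hypothetical, and the example is Unix-only.

```rust
use std::io::{self, Read, Write};
use std::os::unix::net::UnixStream;
use std::time::Instant;

/// Send one pre-serialized frame over a socket pair and return
/// (one-way latency in ns, received payload).
pub fn send_and_measure(payload: &[u8]) -> io::Result<(u64, Vec<u8>)> {
    let (mut tx, mut rx) = UnixStream::pair()?;
    let clock = Instant::now();

    // Pre-serialize: build the whole frame before any timing happens.
    // Layout (assumed): [8-byte timestamp placeholder][payload].
    let mut frame = vec![0u8; 8];
    frame.extend_from_slice(payload);

    // Patch the timestamp in place immediately before the write syscall,
    // so serialization cost is not part of the measured interval.
    let sent_ns = clock.elapsed().as_nanos() as u64;
    frame[..8].copy_from_slice(&sent_ns.to_le_bytes());
    tx.write_all(&frame)?;

    // Receiver side: read the frame, then take the receive timestamp.
    let mut received = vec![0u8; frame.len()];
    rx.read_exact(&mut received)?;
    let recv_ns = clock.elapsed().as_nanos() as u64;

    let mut ts = [0u8; 8];
    ts.copy_from_slice(&received[..8]);
    let latency_ns = recv_ns - u64::from_le_bytes(ts);
    Ok((latency_ns, received[8..].to_vec()))
}
```

Because the timestamp is taken before the write rather than after the receiver wakes up, any scheduling delay between send and receive is counted in the measured latency. The sketch relies on small payloads fitting in the socket buffer, since sender and receiver share one thread here.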
| 90 | +### Performance Comparison |
| 91 | + |
| 92 | +#### Direct Memory SHM vs Ring Buffer SHM |
| 93 | + |
| 94 | +| Metric | Ring Buffer | Direct Memory | Improvement | |
| 95 | +|--------|-------------|---------------|-------------| |
| 96 | +| Mean | 22.51 μs | 7.42 μs | **3.0× faster** | |
| 97 | +| Min | 5.73 μs | 5.00 μs | 1.15× faster | |
| 98 | +| Max | 9,983 μs | 22.18 μs | **450× faster** | |
| 99 | + |
| 100 | +#### Rust vs C Benchmarks (Blocking Mode) |
| 101 | + |
| 102 | +| Test | Metric | C (ns) | Rust (ns) | Rust/C Ratio | |
| 103 | +|------|--------|--------|-----------|--------------| |
| 104 | +| PMQ NoLoad | Avg | 8,498 | 8,635 | 1.02× (equal) | |
| 105 | +| SHM Load | Avg | 95,094 | 95,719 | 1.01× (equal) | |
| 106 | +| SHM Load | Min | 5,729 | 2,344 | **0.41× (59% lower)** | |
| 107 | +| UDS NoLoad | Avg | 18,445 | 14,502 | **0.79× (21% lower)** | |
| 108 | +| UDS NoLoad | Max | 81,042 | 50,781 | **0.63× (37% lower)** | |
| 109 | + |
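For reference, the Rust/C Ratio column is the Rust latency divided by the C latency, and the percentage is the corresponding reduction in latency. A small sketch of that arithmetic, with hypothetical helper names:

```rust
/// Rust latency divided by C latency; values below 1.0 mean Rust is faster.
pub fn latency_ratio(rust_ns: u64, c_ns: u64) -> f64 {
    rust_ns as f64 / c_ns as f64
}

/// Percentage by which the Rust latency is lower than the C latency.
pub fn percent_lower(rust_ns: u64, c_ns: u64) -> f64 {
    (1.0 - latency_ratio(rust_ns, c_ns)) * 100.0
}
```

For the SHM Load Min row, `latency_ratio(2_344, 5_729)` is about 0.41 and `percent_lower(2_344, 5_729)` is about 59.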
| 110 | +### Technical Details |
| 111 | + |
| 112 | +#### Code Statistics |
| 113 | +- **12,306 lines added** across 38 files |
| 114 | +- **5,305 lines** of blocking implementation code |
| 115 | +- **1,642 lines** of documentation |
| 116 | +- **30+ commits** with clear progression |
| 117 | + |
| 118 | +#### Quality Metrics |
| 119 | +- ✅ 68 tests passing (40 doctests + 28 integration tests) |
| 120 | +- ✅ Zero clippy warnings |
| 121 | +- ✅ Zero compiler warnings |
| 122 | +- ✅ All unsafe code documented with SAFETY comments |
| 123 | +- ✅ Comprehensive error handling with context |
| 124 | +- ✅ Full backward compatibility maintained |
| 125 | + |
| 126 | +## [0.1.0] - Initial Release |
| 127 | + |
| 128 | +### Added |
| 129 | +- Initial async implementation with Tokio runtime |
| 130 | +- Unix Domain Sockets (UDS) transport |
| 131 | +- TCP Sockets transport |
| 132 | +- Shared Memory (ring buffer) transport |
| 133 | +- POSIX Message Queues transport (Linux only) |
| 134 | +- Comprehensive metrics collection with HDR histograms |
| 135 | +- JSON and CSV output formats |
| 136 | +- Streaming output support |
| 137 | +- CPU affinity controls |
| 138 | +- Warmup iterations support |
| 139 | + |
| 140 | +--- |
| 141 | + |
| 142 | +## Legend |
| 143 | + |
| 144 | +- **Added** - New features |
| 145 | +- **Changed** - Changes to existing functionality |
| 146 | +- **Deprecated** - Soon-to-be removed features |
| 147 | +- **Removed** - Removed features |
| 148 | +- **Fixed** - Bug fixes |
| 149 | +- **Security** - Vulnerability fixes |
| 150 | + |
| 151 | +[Unreleased]: https://github.com/redhat-performance/rusty-comms/compare/main...feature/shm-direct-memory |
| 152 | +[0.1.0]: https://github.com/redhat-performance/rusty-comms/releases/tag/v0.1.0 |
| 153 | + |