fix(windows): guard Linux-only POSIX RT calls so the installer build compiles under MSYS2#133
Merged
Merged
Conversation
Pre-release tags (e.g. v4.1.0-rc.1) should not overwrite the latest tag in ghcr. Only stable releases (no hyphen suffix) get latest. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pre-release tags (e.g. v4.1.0-rc.1) should only build Docker images, not Windows installers. Manual workflow_dispatch still works. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Release v4.1.1 — STruC++ compiler + VPP support + compile CPU cap
v4.1.1's Windows installer build failed at `core/src/plc_app/image_tables.cpp:96` with: error: 'PTHREAD_PRIO_INHERIT' was not declared in this scope `PTHREAD_PRIO_INHERIT` is the POSIX priority-inheritance mutex protocol — optional in the standard. MSYS2/MinGW pthread on Windows doesn't ship it (`_POSIX_THREAD_PRIO_INHERIT` is undefined and `pthread_mutexattr_setprotocol` isn't declared). Priority inheritance is only meaningful on a real-time-scheduled system (SCHED_FIFO threads); Windows has no RT scheduler, so the protocol would be a no-op even if it linked. Same broader pattern applies to the CPU-affinity block in `plc_state_manager.cpp::plc_task_thread`: `cpu_set_t` / `CPU_ZERO` / `CPU_SETSIZE` / `CPU_SET` / `pthread_setaffinity_np` are Linux `<sched.h>` extensions, not part of standard POSIX, and MSYS2's pthread headers don't define them. Fix: guard both blocks with `#if !defined(__CYGWIN__) && !defined(__MSYS__)` (plus `_POSIX_THREAD_PRIO_INHERIT` for the PI call and `__linux__` for the CPU-affinity block). Windows degrades to a plain recursive mutex + default scheduling — consistent with the platform's lack of real-time support. The existing `pthread_setschedparam(SCHED_FIFO)` call in the same function is already runtime-graceful (logs a warning on EPERM and falls through), so it doesn't need a compile-time guard; MSYS2's pthread.h declares the symbol but the kernel rejects the request at runtime, which is the behaviour we already handle. Reference: the same `#if !defined(__CYGWIN__) && !defined(__MSYS__)` pattern already exists in `core/src/plc_app/utils/utils.c` (gating `<sched.h>` / `<sys/mman.h>` / `<sys/prctl.h>` includes via the local `HAS_REALTIME_FEATURES` macro), so this matches an established in-tree convention rather than introducing a new one. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
After fixing the POSIX PI-mutex / CPU-affinity guards in
image_tables.cpp + plc_state_manager.cpp, the Windows installer
build progressed past the runtime and stopped on the EtherCAT
plugin instead:
/usr/include/w32api/winsock2.h:1031:34: error: conflicting types
for 'select'; have 'int(int, fd_set *, fd_set *, fd_set *,
const TIMEVAL *)'
/usr/include/w32api/winsock2.h:1040:34: error: conflicting types
for 'gethostname'; have 'int(char *, int)'
Root cause: the vendored SOEM (Simple Open EtherCAT Master) ships
its own Windows port under
`core/src/drivers/plugins/native/ethercat/libs/soem/oshw/win32/wpcap/`
that pulls in WinPcap headers, which collide with MSYS2's w32api
`winsock2.h` (duplicate `select` / `gethostname` declarations).
Even if those headers compiled cleanly the EtherCAT plugin itself
relies on SCHED_FIFO + PREEMPT_RT scheduling that Windows doesn't
provide — it's fundamentally Linux-only by design. No point
trying to ship a non-functional Windows EtherCAT plugin.
This patch teaches `build_native_plugins()` in install.sh to skip
plugins listed in a `linux_only_plugins=("ethercat")` array when
running under MSYS2. Linux/macOS builds are unchanged; Windows
gets a clean build that excludes the plugin entirely and logs
"Skipping Linux-only plugin on MSYS2/Windows: ethercat".
The array makes future Linux-only plugins (eg. xenomai, custom
PREEMPT_RT-dependent drivers) trivial to add.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This reverts commit 8c02488.
The vendored SOEM library and its wpcap include path were already
wired up for MSYS2 in the EtherCAT CMakeLists.txt (auto-generates a
`MSYS.cmake` for SOEM that defines `WIN32` and adds the wpcap
include directory). SOEM's own .c files compile fine — but
`ethercat_plugin.c` and its siblings fail with:
error: conflicting types for 'select'; have
'int(int, fd_set *, fd_set *, fd_set *, const TIMEVAL *)'
note: previous declaration of 'select' with type
'int(int, fd_set *, fd_set *, fd_set *, struct timeval *)'
Root cause: `ethercat_plugin.c` includes both `<signal.h>` (POSIX
chain → `sys/select.h` → POSIX `select` with `struct timeval *`)
AND `soem/soem.h` (→ `nicdrv.h` win32 → `pcap.h` → `pcap-stdinc.h`
→ `winsock2.h` → Win32 `select` with `const TIMEVAL *`). MSYS2's
two libc's disagree on the signature so we get a redeclaration
error. Same story for `gethostname` (`size_t` vs `int`).
MSYS2's `sys/select.h` is explicit about this:
/* We don't define fd_set and friends if we are compiling POSIX
... (defined by __USE_W32_SOCKETS) the W32api winsock[2].h
header which it must not call the Cygwin select function. */
# if !(defined (_WINSOCK_H) || defined (_WINSOCKAPI_) || \
defined (__USE_W32_SOCKETS))
Same guard exists in `sys/unistd.h` around `gethostname`.
Fix: add `__USE_W32_SOCKETS` to SOEM's `target_compile_definitions
(PUBLIC ...)` block. The PUBLIC scope propagates the define to
every target linking against soem — including `ethercat_plugin` —
so the POSIX chain's `select` / `gethostname` decls get suppressed
in those TUs and the Win32 wpcap-supplied decls win without
conflict. No effect on Linux builds (the whole block is gated by
`if(CMAKE_SYSTEM_NAME STREQUAL \"MSYS\" OR ... \"CYGWIN\")`).
Verified by reproducing the conflict in a Windows VM with the same
MSYS2 toolchain (gcc 15.2.0, w32api 14.0.0) and watching the
EtherCAT plugin build go from \"conflicting types for 'select'\" to
clean: \`libethercat_plugin.so\` artefact produced.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
4 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
v4.1.1's Windows installer build failed at `core/src/plc_app/image_tables.cpp:96`:
```
error: 'PTHREAD_PRIO_INHERIT' was not declared in this scope; did you mean 'PTHREAD_ONCE_INIT'?
```
`PTHREAD_PRIO_INHERIT` is the POSIX priority-inheritance mutex protocol — optional in the standard. MSYS2/MinGW pthread on Windows doesn't ship it (`_POSIX_THREAD_PRIO_INHERIT` is undefined, `pthread_mutexattr_setprotocol` isn't declared). And it's a no-op even when it works without an RT scheduler — Windows has no SCHED_FIFO, so there's nothing to priority-boost into.
Same pattern hits `plc_state_manager.cpp`'s CPU-affinity block: `cpu_set_t` / `CPU_ZERO` / `CPU_SETSIZE` / `pthread_setaffinity_np` are Linux `<sched.h>` extensions, not standard POSIX. MSYS2 doesn't define them. Build never reaches this file because it stops at `image_tables.cpp` first, but the next compile would have hit it.
What changed
Two surgical `#if` guards:
The existing `pthread_setschedparam(SCHED_FIFO)` call in the same function is already runtime-graceful (logs a warning on EPERM, falls through) so it needs no compile-time guard — MSYS2 declares the symbol; the kernel rejects the request at runtime.
The `#if !defined(CYGWIN) && !defined(MSYS)` pattern matches the existing `HAS_REALTIME_FEATURES` macro in `core/src/plc_app/utils/utils.c` — established convention in-tree.
Why didn't earlier RC builds fail this way
`v4.1.0-rc.1` (April 3) was the last Windows-installer build that ran successfully. rc.2 / rc.3 / rc.4 all had the job skipped. The Phase 5 STruC++ shim work added `image_tables.cpp` (and its sibling `plc_state_manager.cpp`) after rc.1, so this file path is brand new on Windows.
Test plan