
Instability in iOS tests under CI conditions #2335


Closed
freakboy3742 opened this issue Mar 25, 2025 · 7 comments · Fixed by #2337

Comments

@freakboy3742
Contributor

Reported by @henryiii in this comment.

It appears that iOS builds are seeing unusually high failure rates on CI jobs - possibly as high as 50% of CI jobs. It's hard to get precise metrics out of the tools GitHub provides, but there's no shortage of failures on the Test workflow for macOS runs - usually macOS-14, but sometimes macOS-13 (x86_64).

Digging into the logs, the failures all seem to be timeouts:

  • Hitting the 5-minute timeout for running a build on the simulator
  • Hitting a 10-second timeout waiting for the return value of the simulator list tool.

I'm not seeing this failure rate locally - but local conditions will always be more favourable for test timing, as they can involve partially warmed simulator images, and won't usually be competing with other build tasks on the same physical hardware.
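To make the second of those failure modes concrete, here is a minimal sketch (not the project's actual code; the function name is illustrative) of the kind of call involved: querying the simulator list via xcrun simctl list with a 10-second limit, which raises an exception if the runner is too slow to respond.

```python
import json
import subprocess


def list_simulators(timeout: float = 10.0) -> dict:
    """Ask CoreSimulator for the available devices and runtimes.

    On a heavily loaded runner this can exceed `timeout`, raising
    subprocess.TimeoutExpired -- the second failure mode listed above.
    """
    result = subprocess.run(
        ["xcrun", "simctl", "list", "--json"],
        capture_output=True,
        text=True,
        check=True,
        timeout=timeout,
    )
    return json.loads(result.stdout)
```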

@freakboy3742
Contributor Author

Anecdotally, there also seems to be highly variable performance - especially I/O performance - on GitHub Actions macOS runners. Toga doesn't use cibuildwheel to run tests, but it does start an iOS simulator as part of tests. Sometimes the simulator can start almost immediately; and sometimes it can take a couple of minutes. My current hypothesis is that the 5 minute timeout is a little too aggressive for the "worst case" CI performance.
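As a rough sketch of the direction a fix could take (this is not the change that actually landed, and the environment variable name is hypothetical), the build timeout could be made configurable with a more forgiving default:

```python
import os
import subprocess

# Hypothetical override; the real support package may expose this differently.
DEFAULT_BUILD_TIMEOUT = 600  # seconds: 10 minutes rather than 5


def run_simulator_build(xcodebuild_args: list[str]) -> None:
    """Run xcodebuild with a timeout that tolerates slow CI runners."""
    timeout = float(os.environ.get("IOS_BUILD_TIMEOUT", DEFAULT_BUILD_TIMEOUT))
    subprocess.run(["xcodebuild", *xcodebuild_args], check=True, timeout=timeout)
```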

Unfortunately, there's no real way to replicate this other than "start a CI job"... and the fix will likely require an updated support package. So - apologies in advance for the thrashing I'm about to give CI. I'll try to minimise the impact to just macOS jobs.

@freakboy3742
Contributor Author

This also seems to be a problem that is specific to GitHub CI. As far as I can make out from CI logs, CircleCI and Azure Pipelines tests are passing consistently.

@joerick
Contributor

joerick commented Mar 25, 2025

I have never run the iOS tests locally, so I tried running them just now.

The first run took a pretty long time; pytest tells me:

3 passed in 507.62s (0:08:27)

Activity Monitor showed lots of CPU usage in that time from various processes, including /Library/Developer/CoreSimulator/Volumes/iOS_21C62/Library/Developer/CoreSimulator/Profiles/Runtimes/iOS\ 17.2.simruntime/Contents/Resources/update_dyld_sim_shared_cache. I also got a crash report for MobileCal during that run, though it didn't appear to affect anything.

The second run of pytest took:

3 passed in 122.54s (0:02:02)

So, much faster. I wonder if there's a way to warm the Simulator before starting the iOS tests?

@freakboy3742
Contributor Author

> So, much faster. I wonder if there's a way to warm the Simulator before starting the iOS tests?

It's certainly worth a try. I've incorporated that into #2336 - however, as an indication of the sort of pathological performance degradation that is occurring, this build, which pre-warms the simulator, took almost 2 minutes to run xcrun simctl boot "iPhone SE (3rd generation)" on the macOS-14 runner - a command that takes 2 seconds on my M1 MacBook, and 10 seconds on the macOS-13 runner.
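For reference, the pre-warming step amounts to something like the sketch below (assuming a shell-out from Python; #2336 may do this differently). xcrun simctl bootstatus -b boots the device if needed and blocks until it has finished booting.

```python
import subprocess


def prewarm_simulator(device: str = "iPhone SE (3rd generation)") -> None:
    """Boot the simulator ahead of the test run so first-boot work
    (e.g. updating the dyld shared cache) doesn't eat into test timeouts."""
    # -b boots the device if it isn't already running, then waits for full boot.
    subprocess.run(["xcrun", "simctl", "bootstatus", device, "-b"], check=True)
```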

@freakboy3742
Contributor Author

A minor victory: by chance, I've been able to reproduce the issue on my own laptop.

The reproduction case: run a CPU-intensive job on the same machine. I was running some video encoding in the background, and the compilation step timed out.

So - I wonder if the issue might be the multi-threaded nature of the integration test suite. xcodebuild will run a multithreaded build; multiple parallel compilation passes, plus starting Xcode and the simulator, might be just a little too much for the GitHub macOS runners to handle.

Another thought is that the runner is on macOS 14, which defaults to Xcode 15.4. Toga saw some performance issues with some of the Xcode 15 releases; trying a different Xcode release might help.

@freakboy3742
Contributor Author

End-of-day progress update:

Ensuring the iOS tests aren't just running on a single worker, but are the only tests running at the time, seems to be the magic trick. Tests using --num-processes=1 have succeeded a couple of times; the most recent builds on #2336 use a pytest.mark to identify a "serial" group of tests that must be run without xdist, which has the same net effect without the performance penalty for the remaining 95% of tests that can coexist in parallel.
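For reference, the marker-based split amounts to something like the sketch below (names are illustrative; the actual change in #2336 may differ): register a serial marker, tag the iOS tests with it, and give them their own pytest invocation outside xdist.

```python
# conftest.py (sketch)
import pytest


def pytest_configure(config):
    # Register the marker so --strict-markers doesn't reject it.
    config.addinivalue_line(
        "markers", "serial: tests that must run alone, outside the xdist workers"
    )


# test_ios.py (sketch)
@pytest.mark.serial
def test_ios_simulator_build():
    ...
```

CI would then run the bulk of the suite in parallel and the serial group on its own, e.g. pytest -m "not serial" -n auto followed by pytest -m serial -p no:xdist.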

I'm also switching the Xcode version being used to 16.2; I don't know if that's helping, but it definitely won't be hurting. The other (and possibly better long-term) option would be to switch to the macos-15 runner - it's technically still in beta, but it uses Xcode 16.2 by default, and presumably it's not far off being the new macos-latest runner, given that macOS-16 is likely 3 months from being announced, and 6 months from release.

I'll run CI a few more times tomorrow so I can confirm this isn't just the favourable AU timezone revealing itself again; if I have continued success, I'll formalize the change as a "non-test" PR.

@freakboy3742
Contributor Author

Extra testing in the morning seems to be stable; #2337 is the final form of the fix.
