-
Notifications
You must be signed in to change notification settings - Fork 32
Improves test throughput, especially for threaded tests #1749
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
…M_TEST_NUM_OMP_THREAD Sets the default to 4 when using OpenMP; 0 otherwise. This will be used for OpenMP-based tests.
Uses default number of threads per job via AXOM_TEST_NUM_OMP_THREADS cache variable.
This speeds up unit test time in debug builds by a few minutes.
| # sets the number of threads for the omp policy; leave empty for non-omp policies | ||
| set(_num_threads) | ||
| if(_pol STREQUAL "omp") | ||
| set(_num_threads ${AXOM_TEST_NUM_OMP_THREADS}) | ||
| endif() | ||
|
|
||
| axom_add_test( | ||
| NAME "core_array_perf_1d_${_pol}" | ||
| COMMAND core_array_perf_ex --policy ${_pol} --shape 1200000 -r 10) | ||
| NAME core_array_perf_1d_${_pol} | ||
| COMMAND core_array_perf_ex --policy ${_pol} --shape 1200000 -r 10 | ||
| NUM_OMP_THREADS ${_num_threads}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: This PR relies on allowing empty arguments to NUM_OMP_THREADS, in which case if(arg_NUM_OMP_THREADS) will evaluate to false
| if(AXOM_ENABLE_MPI) | ||
| axom_add_test(NAME ${_test_name} | ||
| COMMAND utils_annotations_test -m ${_mode} | ||
| NUM_MPI_TASKS 2) | ||
| else() | ||
| axom_add_test(NAME ${_test_name} | ||
| COMMAND utils_annotations_test -m ${_mode}) | ||
| set(_num_ranks 2) | ||
| endif() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: A recent BLT PR allows for NUM_MPI_TASKS to be 1 in non-mpi configs.
This allows us to simplify some test setup logic.
| --meshType bpSidre | ||
| inline_mesh --min -2 -2 -2 --max 2 2 2 --res 30 30 30 | ||
| NUM_MPI_TASKS ${_nranks}) | ||
| inline_mesh --min -2 -2 -2 --max 2 2 2 --res 10 10 10 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gunney1 -- in a debug config, these tests were each taking ~100 seconds at a resolution of
| #------------------------------------------------------------------------------ | ||
| # Test execution controls | ||
| #------------------------------------------------------------------------------ | ||
| if(AXOM_ENABLE_OPENMP) | ||
| set(AXOM_TEST_NUM_OMP_THREADS 4 CACHE STRING "Default number of OpenMP threads for tests") | ||
| else() | ||
| set(AXOM_TEST_NUM_OMP_THREADS 0 CACHE STRING "Default number of OpenMP threads for tests") | ||
| endif() | ||
| mark_as_advanced(AXOM_TEST_NUM_OMP_THREADS) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: I added a new config variable to control the default number of threads per test. Users/Devs can mostly ignore it.
| set(_numOMPThreads 8) # limit omp parallelism | ||
| set(_policies seq) | ||
| set(_policies "seq") | ||
| if(RAJA_FOUND) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Longer term, what about putting this conditional logic in a function that returns the supported policies? It must occur in multiple places.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea -- I was thinking about something like that.
Maybe in a follow up PR so it's easier to distinguish from this set of changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update: #1750
BradWhitlock
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm looking forward to the testing speedups!
| or when `components="all"`), all components are enabled. All components can be disabled via `components=none`. | ||
| - Adds a `conduit` spack variant. Conduit was previously a required dependency in our spack package, and is now enabled by default. | ||
| - Adds spack variants for `adiak` and `caliper`. These replace the previous `profiling` variant which enabled both at the same time. | ||
| - Adds the `AXOM_TEST_NUM_OMP_THREADS` configuration variable to control the default OpenMP thread count for tests. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| - Adds the `AXOM_TEST_NUM_OMP_THREADS` configuration variable to control the default OpenMP thread count for tests. | |
| - Adds the `AXOM_TEST_OMP_NUM_THREADS` configuration variable to control the default OpenMP thread count for tests. |
Minor nit. I think it's better to be consistent with the OMP_NUM_THREADS env variable or OpenMP library function with a similar name. Here and throughout.
Also, the terms ranks and tasks are used interchangeably as they relate to MPI. Should that be made consistent also?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for consistency, but it might be tricky here.
The keywords/parameters for blt_add_test are NUM_MPI_TASKS and NUM_OMP_THREADS, while the ENV variable for OpenMP is OMP_NUM_THREADS. It's not clear that renaming the blt parameters would be appropriate or an improvement, and would affect all blt users.
What do you suggest?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aargh--- naming is important and consistency is a desirable quality. I would say, let's match blt_add_test, and in any discussion let's point out the discrepancy to OpenMP environment variables.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably best to discuss with Chris and others. I would defer to OpenMP which is a common, well-known standard, whereas BLT it not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be confusing to switch it. This is an existing macro that's been in the code for a long time and is entirely Axom developer focused.
We're calling axom_add_test() which directly calls blt_add_test() and all other keywords/parameters are the same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My vote is to match BLT since it is influencing a BLT macro. The reason it is named that way is because it follows the pattern of the BLT MPI variable which was created long before the OpenMP parameter which was before I knew that OMP_NUM_THREADS environment variable existed.
a1a4fd1 to
db0ae2b
Compare
| blt_add_test(NAME ${arg_NAME} | ||
| COMMAND ${arg_COMMAND} | ||
| NUM_MPI_TASKS ${arg_NUM_MPI_TASKS} | ||
| NUM_OMP_THREADS ${arg_NUM_OMP_THREADS} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I recommend that we don't change the names of the args to axom_add_test, and that in the documentation comment for this function, that you please add something like "The OpenMP environment variable is OMP_NUM_THREADS, but BLT uses NUM_MPI_TASKS and NUM_OMP_THREADS. We are electing to match BLT."
Arlie-Capps
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the rework, Kenny. As a style recommendation, I would say don't change the names of arguments to axom_add_test. But whatever you choose, please do add a bit to the documentation comment for that function, pointing out the mismatch to OpenMP environment variables.
Summary
This PR focuses on our test throughput, especially in configurations with OpenMP enabled.
OMP_NUM_THREADSenvironment variable, and were blocking all other tests leading to excessive test times.in excess of 300 seconds on a beefy node. After this PR, it's down to about 80 secondsNevertheless, this branch does improve our test times by about 20%
C2CandMFEMfor my initial testing, which removed a bunch of the more expensive tests.NUM_MPI_TASKSandNUM_OMP_THREADSparameters toaxom_add_test, which forwards them toblt_add_testPROCESSORStest property. Adds PROCESSORS property to tests based on number of ranks and threads blt#749AXOM_TEST_NUM_OMP_THREADSconfig variable to provide a single, configurable default thread count for OpenMP tests.blt_add_test()call sites toaxom_add_test()and removes ad-hoc per-testPROCESSORS/OMP_NUM_THREADShandling in favor of BLT’s centralized behavior.30^3to10^3since the former were taking over 100 seconds each in debug configs when using the ctestPROCESSORSproperty.Test throughput improvements
Comparing test runtimes on an exclusively allocated toss4 node on LC using
ctest -j100Notes:
PROCESSORStest property is actually slowing us down by about 3%, and we should consider other approaches to increasing our test throughput, e.g. CMake's resource spec and resource groups -- https://cmake.org/cmake/help/latest/prop_test/RESOURCE_GROUPS.htmlRESOURCE_LOCKserializes invocation of the MPI tests to better see how long each test takes.