
Conversation

@kennyweiss (Member) commented Dec 22, 2025

Summary

This PR focuses on improving our test throughput, especially in configurations with OpenMP enabled.

  • The threaded tests were not all using the OMP_NUM_THREADS environment variable, and were blocking all other tests, leading to excessive test times -- in excess of 300 seconds on a beefy node. After this PR, it's down to about 80 seconds.
    • Update: It seems that I overstated the benefits of adding the PROCESSORS tag. It might even slow our tests down.
      Nevertheless, this branch does improve our test times by about 20%.
    • It seems that I inadvertently disabled C2C and MFEM for my initial testing, which removed a bunch of the more expensive tests.
  • It also improves the consistency of how we set up the number of MPI ranks and OpenMP threads, via the NUM_MPI_TASKS and NUM_OMP_THREADS parameters to axom_add_test, which forwards them to blt_add_test (see the sketch after this list).
  • It relies on a concurrent PR to blt which properly marks unit tests with the number of requested processors using the CMake PROCESSORS test property: Adds PROCESSORS property to tests based on number of ranks and threads (blt#749)
    • Update: Since the PROCESSORS property did not pan out (at least in its current state), I removed it from this PR.
  • Introduces AXOM_TEST_NUM_OMP_THREADS config variable to provide a single, configurable default thread count for OpenMP tests.
  • Converts multiple blt_add_test() call sites to axom_add_test() and removes ad-hoc per-test PROCESSORS/OMP_NUM_THREADS handling in favor of BLT’s centralized behavior.
  • Reduces the resolution of the quest_shaping_in_memory tests from 30^3 to 10^3 since the former were taking over 100 seconds each in debug configs when using the ctest PROCESSORS property.
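
For illustration, here is a minimal sketch of the resulting pattern (the test name and executable are hypothetical, not from this PR):

    # axom_add_test() forwards NUM_MPI_TASKS / NUM_OMP_THREADS to blt_add_test(),
    # and OpenMP tests pick up the configurable AXOM_TEST_NUM_OMP_THREADS default.
    axom_add_test(NAME            my_component_omp_test
                  COMMAND         my_component_omp_test_exe
                  NUM_MPI_TASKS   2
                  NUM_OMP_THREADS ${AXOM_TEST_NUM_OMP_THREADS})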

Test throughput improvements

Comparing test runtimes on an exclusively allocated toss4 node on LC using ctest -j100

time          branch                                     uses PROCESSORS property   uses RESOURCE_LOCK
349 seconds   axom@develop
364 seconds   axom@develop
356 seconds   axom@develop
364 seconds   axom@develop
296 seconds   this branch
295 seconds   this branch
305 seconds   this branch
314 seconds   this branch
-----------   -----------------------------------------
250 seconds   this branch (w/ lower heroic_roses res)

Notes:

  • After running some experiments, it looks like adding the PROCESSORS test property actually slows us down by about 3%, so we should consider other approaches to increasing our test throughput, e.g. CMake's resource specs and resource groups -- https://cmake.org/cmake/help/latest/prop_test/RESOURCE_GROUPS.html (a sketch of that alternative appears after these notes).
  • RESOURCE_LOCK serializes invocation of the MPI tests so we can better see how long each test takes:

    # Optionally, add RESOURCE_LOCK property to mpi tests for timing purposes
    if(arg_NUM_MPI_TASKS AND BLT_ENABLE_MPI)
        set_property(TEST ${arg_NAME}
                     APPEND PROPERTY RESOURCE_LOCK lock_name)
    endif()
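
For reference, a hedged sketch of the resource-groups alternative mentioned above (the 'cores' resource name and this wiring are illustrative, not something this PR implements; ctest would also need a matching resource spec file):

    # Hypothetical: describe each test's CPU needs via RESOURCE_GROUPS so that ctest
    # schedules against a resource spec file instead of the PROCESSORS property.
    # A test using 2 ranks x 4 threads would request 8 slots of a 'cores' resource.
    if(arg_NUM_MPI_TASKS AND arg_NUM_OMP_THREADS)
        math(EXPR _num_cores "${arg_NUM_MPI_TASKS} * ${arg_NUM_OMP_THREADS}")
        set_property(TEST ${arg_NAME} PROPERTY RESOURCE_GROUPS "cores:${_num_cores}")
    endif()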

@kennyweiss kennyweiss self-assigned this Dec 22, 2025
@kennyweiss kennyweiss added the Testing (Issues related to testing Axom), maintenance (Issues related to code maintenance), mpi (Related to MPI communication), and OpenMP (Related to OpenMP support) labels Dec 22, 2025
Comment on lines +81 to +90
+    # sets the number of threads for the omp policy; leave empty for non-omp policies
+    set(_num_threads)
+    if(_pol STREQUAL "omp")
+        set(_num_threads ${AXOM_TEST_NUM_OMP_THREADS})
+    endif()
+
     axom_add_test(
-        NAME "core_array_perf_1d_${_pol}"
-        COMMAND core_array_perf_ex --policy ${_pol} --shape 1200000 -r 10)
+        NAME core_array_perf_1d_${_pol}
+        COMMAND core_array_perf_ex --policy ${_pol} --shape 1200000 -r 10
+        NUM_OMP_THREADS ${_num_threads})
Member Author:

Note: This PR relies on allowing empty arguments to NUM_OMP_THREADS, in which case if(arg_NUM_OMP_THREADS) evaluates to false.
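
To illustrate the behavior being relied on, here is a standalone sketch (not the macro's actual code) of how an empty parsed argument behaves in if():

    # An empty value passed by the caller leaves the parsed variable empty,
    # and if(<variable>) on an empty value evaluates to false, so the macro
    # simply skips its OMP_NUM_THREADS handling.
    set(arg_NUM_OMP_THREADS "")
    if(arg_NUM_OMP_THREADS)
        message(STATUS "would set OMP_NUM_THREADS=${arg_NUM_OMP_THREADS}")
    else()
        message(STATUS "no thread count requested; OMP_NUM_THREADS left untouched")
    endif()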

Comment on lines 186 to 193
 if(AXOM_ENABLE_MPI)
-    axom_add_test(NAME ${_test_name}
-                  COMMAND utils_annotations_test -m ${_mode}
-                  NUM_MPI_TASKS 2)
-else()
-    axom_add_test(NAME ${_test_name}
-                  COMMAND utils_annotations_test -m ${_mode})
+    set(_num_ranks 2)
 endif()
Member Author:

Note: A recent BLT PR allows for NUM_MPI_TASKS to be 1 in non-mpi configs.
This allows us to simplify some test setup logic.
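
For context, a hedged sketch of the kind of simplification this enables (not necessarily the exact code in this PR; _test_name and _mode come from the surrounding loop):

    # With NUM_MPI_TASKS allowed to be 1 in non-MPI configs, a single call suffices.
    set(_num_ranks 1)
    if(AXOM_ENABLE_MPI)
        set(_num_ranks 2)
    endif()

    axom_add_test(NAME ${_test_name}
                  COMMAND utils_annotations_test -m ${_mode}
                  NUM_MPI_TASKS ${_num_ranks})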

     --meshType bpSidre
-    inline_mesh --min -2 -2 -2 --max 2 2 2 --res 30 30 30
+    inline_mesh --min -2 -2 -2 --max 2 2 2 --res 10 10 10
     NUM_MPI_TASKS ${_nranks})
Member Author:

@gunney1 -- in a debug config, these tests were each taking ~100 seconds at a resolution of $30^3$. Changing it to $10^3$ reduces the runtime significantly to around 1-10 seconds per test.

Comment on lines +50 to +58
#------------------------------------------------------------------------------
# Test execution controls
#------------------------------------------------------------------------------
if(AXOM_ENABLE_OPENMP)
    set(AXOM_TEST_NUM_OMP_THREADS 4 CACHE STRING "Default number of OpenMP threads for tests")
else()
    set(AXOM_TEST_NUM_OMP_THREADS 0 CACHE STRING "Default number of OpenMP threads for tests")
endif()
mark_as_advanced(AXOM_TEST_NUM_OMP_THREADS)
Member Author:

Note: I added a new config variable to control the default number of threads per test. Users/Devs can mostly ignore it.
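
For example, a host-config or initial cache file could override the default (illustrative only):

    # Request 8 OpenMP threads per test instead of the default of 4
    set(AXOM_TEST_NUM_OMP_THREADS 8 CACHE STRING "Default number of OpenMP threads for tests")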

-set(_numOMPThreads 8) # limit omp parallelism
-set(_policies seq)
+set(_policies "seq")
 if(RAJA_FOUND)
Member:

Longer term, what about putting this conditional logic in a function that returns the supported policies? It must occur in multiple places.
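
For illustration, such a helper might look roughly like this (the function name and the exact set of guard variables are assumptions, not the follow-up implementation):

    # Hypothetical helper that centralizes the supported-policy logic.
    function(axom_get_test_policies _out_var)
        set(_policies "seq")
        if(RAJA_FOUND)
            if(AXOM_ENABLE_OPENMP)
                list(APPEND _policies "omp")
            endif()
            if(AXOM_ENABLE_CUDA)
                list(APPEND _policies "cuda")
            endif()
            if(AXOM_ENABLE_HIP)
                list(APPEND _policies "hip")
            endif()
        endif()
        set(${_out_var} "${_policies}" PARENT_SCOPE)
    endfunction()

    # Usage:
    axom_get_test_policies(_policies)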

Member Author:

Good idea -- I was thinking about something like that.
Maybe in a follow up PR so it's easier to distinguish from this set of changes.

Member Author:

Update: #1750

@BradWhitlock (Member) left a comment:

I'm looking forward to the testing speedups!

or when `components="all"`), all components are enabled. All components can be disabled via `components=none`.
- Adds a `conduit` spack variant. Conduit was previously a required dependency in our spack package, and is now enabled by default.
- Adds spack variants for `adiak` and `caliper`. These replace the previous `profiling` variant which enabled both at the same time.
- Adds the `AXOM_TEST_NUM_OMP_THREADS` configuration variable to control the default OpenMP thread count for tests.
@rhornung67 (Member) Dec 22, 2025:

Suggested change
- Adds the `AXOM_TEST_NUM_OMP_THREADS` configuration variable to control the default OpenMP thread count for tests.
- Adds the `AXOM_TEST_OMP_NUM_THREADS` configuration variable to control the default OpenMP thread count for tests.

Minor nit. I think it's better to be consistent with the OMP_NUM_THREADS env variable or OpenMP library function with a similar name. Here and throughout.

Also, the terms ranks and tasks are used interchangeably as they relate to MPI. Should that be made consistent also?

Member Author:

+1 for consistency, but it might be tricky here.

The keywords/parameters for blt_add_test are NUM_MPI_TASKS and NUM_OMP_THREADS, while the env variable for OpenMP is OMP_NUM_THREADS. It's not clear that renaming the blt parameters would be appropriate or an improvement, and it would affect all blt users.

What do you suggest?

Contributor:

Aargh -- naming is important and consistency is a desirable quality. I would say let's match blt_add_test, and in any discussion let's point out the discrepancy with the OpenMP environment variables.

Member:

Probably best to discuss with Chris and others. I would defer to OpenMP, which is a common, well-known standard, whereas BLT is not.

Member Author:

I think it would be confusing to switch it. This is an existing macro that's been in the code for a long time and is entirely Axom developer focused.

We're calling axom_add_test() which directly calls blt_add_test() and all other keywords/parameters are the same.

@white238 (Member) Dec 23, 2025:

My vote is to match BLT since it is influencing a BLT macro. The reason it is named that way is that it follows the pattern of the BLT MPI parameter, which was created long before the OpenMP parameter, and before I knew that the OMP_NUM_THREADS environment variable existed.

@kennyweiss kennyweiss force-pushed the feature/kweiss/processors branch from a1a4fd1 to db0ae2b on December 22, 2025 23:17
blt_add_test(NAME ${arg_NAME}
             COMMAND ${arg_COMMAND}
             NUM_MPI_TASKS ${arg_NUM_MPI_TASKS}
             NUM_OMP_THREADS ${arg_NUM_OMP_THREADS}
Contributor:

I recommend that we don't change the names of the args to axom_add_test, and that in the documentation comment for this function you add something like: "The OpenMP environment variable is OMP_NUM_THREADS, but BLT uses NUM_MPI_TASKS and NUM_OMP_THREADS. We are electing to match BLT."
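
A hedged sketch of what that documentation note might look like in the macro's header comment (wording illustrative only):

    ##------------------------------------------------------------------------------
    ## axom_add_test( NAME            [name]
    ##                COMMAND         [command]
    ##                NUM_MPI_TASKS   [n]
    ##                NUM_OMP_THREADS [n] )
    ##
    ## Note: OpenMP's environment variable is OMP_NUM_THREADS, but this macro
    ## deliberately matches blt_add_test(), which names its parameters
    ## NUM_MPI_TASKS and NUM_OMP_THREADS.
    ##------------------------------------------------------------------------------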

@Arlie-Capps (Contributor) left a comment:

Thanks for the rework, Kenny. As a style recommendation, I would say don't change the names of arguments to axom_add_test. But whatever you choose, please do add a bit to the documentation comment for that function, pointing out the mismatch with the OpenMP environment variables.
