Add GPU benchmark stats guard #135

patdevinwilson · 2025-11-25T15:54:02Z

Add a guard in presto/scripts/run_benchmark.sh that detects when the presto-native-gpu stack is active and verifies all Hive .prestoSchema stats files exist for the chosen TPCH schema before running GPU benchmarks. If any stats file is missing, the script instructs the user to run presto-native-cpu and ANALYZE tables first.
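
The file-presence part of the guard described above could look roughly like the following. This is a minimal sketch, not the PR's actual code: the function names `missing_stats_files` and `require_stats_files` are hypothetical, and the directory layout `<metastore_dir>/<schema>/<table>/.prestoSchema` is an assumption about how the file-based Hive metastore is organized.

```bash
# Sketch: print each table whose .prestoSchema file is missing.
# Assumed layout (not confirmed by the PR):
#   <metastore_dir>/<schema>/<table>/.prestoSchema
# Usage: missing_stats_files <metastore_dir> <schema> <table>...
missing_stats_files() {
  local metastore_dir schema table
  metastore_dir="$1"
  schema="$2"
  shift 2
  for table in "$@"; do
    if [ ! -f "${metastore_dir}/${schema}/${table}/.prestoSchema" ]; then
      printf '%s\n' "${table}"
    fi
  done
}

# Hypothetical guard: return non-zero and name the tables when any file is missing.
require_stats_files() {
  local missing
  missing="$(missing_stats_files "$@")"
  if [ -n "${missing}" ]; then
    echo "Missing .prestoSchema files for: $(printf '%s' "${missing}" | tr '\n' ' ')" >&2
    return 1
  fi
}
```

In run_benchmark.sh this would be called along the lines of `require_stats_files "$HIVE_METASTORE_DIR" "$schema" "${TPCH_TABLES[@]}" || exit 1`, where `$schema` stands for whatever variable holds the chosen TPCH schema.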

misiugodfrey

My only open question is whether we should only require this for GPU clusters. Unless I'm missing something, I think we should require this for all benchmarks, regardless of variant, which also simplifies some of the code.

I also think we should add support to automatically run analyze_tables when we generate the benchmark as part of our setup_benchmark_data_and_tables script, but that work can be done separately if we agree it should move forward.

misiugodfrey · 2025-11-25T18:40:31Z

presto/scripts/run_benchmark.sh


 set -e

+TPCH_TABLES=(customer lineitem nation orders part partsupp region supplier)


Most of our scripts try to avoid hard-coding the table names - although it might add too much complexity to fetch them dynamically. If we expand this to more benchmarks then we may need to update this; but frankly I'm inclined to leave it for now to get simple verification going.

misiugodfrey · 2025-11-25T18:43:37Z

presto/scripts/run_benchmark.sh


+IS_GPU_CLUSTER=0
+if command -v docker >/dev/null 2>&1; then
+  if docker ps --format '{{.Names}}' 2>/dev/null | grep -q 'presto-native-worker-gpu'; then


Is there any reason to only require ANALYZE to have been run for gpu clusters? If we want an appropriate comparison, should we not check this for all clusters regardless of type?

If that's the case, then we can simplify here by removing all the gpu cluster checking and just always require ANALYZE to have been run.

misiugodfrey · 2025-11-25T18:48:32Z

presto/scripts/run_benchmark.sh

+    cat <<EOF
+
+Column statistics are required before running GPU benchmarks.
+Run presto-native-cpu and ANALYZE all tables in schema '$schema', then retry.


There is an analyze_tables.sh script in presto/scripts/ that we should point the user at here so they don't need to do the work themselves.

Frankly, I think we should automatically run the script when we generate tables, but that's work for a separate PR.
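
A sketch of what the revised message might look like if it pointed at the existing script. The function name `print_missing_stats_help` is hypothetical, and the arguments that analyze_tables.sh expects are not shown in this thread, so the message deliberately does not guess at them.

```bash
# Sketch: tell the user which script to run when stats are missing.
# Only the script path comes from the review comment; its arguments are unknown here.
print_missing_stats_help() {
  local schema
  schema="$1"
  cat <<EOF

Column statistics are required before running benchmarks.
Run presto/scripts/analyze_tables.sh for schema '${schema}', then retry.
EOF
}
```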

patdevinwilson · 2025-11-25T21:37:06Z

Let me know when I can close this PR.

Please review:
#136

I’ve reverted the previous stats-checking logic and repurposed the changes so the PR now ensures we automatically run ANALYZE right after tables are created by the setup scripts. This targets your earlier suggestion directly.

Less seems to be more in this case.

mattgara · 2025-11-25T22:27:38Z

This PR is a great initiative. velox-testing could always use more guardrails and intuitive checks.

> Please review: #136
>
> I’ve reverted the previous stats-checking logic and repurposed the changes so the PR now ensures we automatically run ANALYZE right after tables are created by the setup scripts. This targets your earlier suggestion directly.

I think even if we implement an (optional or mandatory) call to analyze tables when setting up the benchmarks, it would still make sense to independently validate that the state we expect actually exists persistently (what this PR does/did originally.)

It probably makes sense to do both :)

misiugodfrey · 2025-11-25T23:02:43Z

> Let me know when I can close this PR.

I think it's a good idea to commit this PR even with #136 going in as well. It's good to have a check in place to make sure things were set up correctly, as the data won't always be set up using the channels we provide.

I think all this needs right now is to remove the IS_GPU_CLUSTER checking and it's good to go.

paul-aiyedun · 2025-11-25T23:44:25Z

Does #136 avoid the need for this PR?

misiugodfrey · 2025-11-25T23:55:23Z

> Does #136 avoid the need for this PR?

I think they are both useful. #136 makes sure we are set up correctly when using the default path, and this one verifies that the setup actually happened. There are still cases where the data could have been set up differently, and this PR will catch those.

paul-aiyedun · 2025-11-26T00:13:48Z

presto/scripts/run_benchmark.sh

 set -e

+TPCH_TABLES=(customer lineitem nation orders part partsupp region supplier)
+HIVE_METASTORE_DIR="$(readlink -f ../docker/.hive_metastore)"


Should we be using something like the SHOW STATS command instead of going through Hive Metastore files?

patdevinwilson · 2025-11-26

> Should we be using something like the SHOW STATS command instead of going through Hive Metastore files?

Doing that up front for all TPCH tables adds noticeable overhead before every benchmark run. SHOW STATS only works per table (and sometimes per column), so we’d need to run a Presto query for every table or column we care about. Some Presto versions also return empty stats rows even when ANALYZE hasn’t been run, so we’d still need to cross-check other metadata to be sure, and that can get flaky when users customize their catalogs.

Inspecting the .prestoSchema files in the Hive metastore gives us a fast, deterministic check: if the JSON file exists and its columnStatistics block is non-empty, we know ANALYZE has been run at least once for that table. It’s essentially the ground truth that SHOW STATS would read anyway.
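
The check described here can be approximated without a JSON parser. The sketch below is only an illustration under assumptions: `has_column_stats` is a hypothetical name, and it assumes that columnStatistics is a JSON object that is `{}` when no stats exist and has at least one key once ANALYZE has run. It strips whitespace and looks for the key followed by an opening brace and a quoted member name, which is a heuristic and not full JSON validation.

```bash
# Sketch: succeed only if the file exists and its columnStatistics object has a key.
# Assumption: an un-analyzed table has "columnStatistics": {} (or no such key).
has_column_stats() {
  local file
  file="$1"
  [ -f "${file}" ] || return 1
  # Remove all whitespace so pretty-printed and compact JSON look the same,
  # then require the opening brace to be followed by a quoted member name.
  tr -d '[:space:]' < "${file}" | grep -qF '"columnStatistics":{"'
}
```

If `jq` is available, a stricter version could parse the file properly instead of pattern matching.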

paul-aiyedun · 2025-11-28

Is there a specification for the .prestoSchema file, or is this relying on an implementation detail?

patdevinwilson force-pushed the checker-column-stats branch from aa253b2 to 0ac467d · November 25, 2025 15:57

Add GPU benchmark stats guard

0ac467d

Always verify stats for TPCH benchmarks

92cb1f6