
Add more logging to sharktank data tests #1017

Draft · erman-gurses wants to merge 4 commits into main from users/erman-gurses/add-more-logging
Conversation

@erman-gurses commented Mar 3, 2025

Add more logging to sharktank data tests
This continues the work from PR #917.

@erman-gurses added the infra label (General category for infrastructure-related requests for common triaging and prioritization) on Mar 3, 2025
@erman-gurses (Author) commented Mar 3, 2025

Will double check and add more logs for the other tests.

@erman-gurses (Author) commented:
@ScottTodd, I raised a draft PR; let me know if you have any feedback and I can iterate on it.

@ScottTodd (Member) commented:
Can you link to example logs before/after in the PR description? That would make review easier.

@erman-gurses (Author) commented Mar 6, 2025

Before logging: https://github.com/nod-ai/shark-ai/actions/runs/13727645858/job/38397644679?pr=1017#step:7:108
After logging: https://github.com/nod-ai/shark-ai/actions/runs/13667985379/job/38212871488?pr=1017#step:7:282
My understanding is that the position of the logs below is wrong; I will correct them tomorrow.

INFO:tests.models.llama.benchmark_amdgpu_test BenchmarkLlama3_1_8B...
INFO:tests.models.llama.benchmark_amdgpu_test Testing Benchmark8B_f16_Non_Decomposed_Input_Len_128...
INFO:tests.models.llama.benchmark_amdgpu_test Testing Benchmark8B_f16_Non_Decomposed_Input_Len_2048...
INFO:tests.models.llama.benchmark_amdgpu_test Testing Benchmark8B_fp8_Non_Decomposed...
INFO:tests.models.llama.benchmark_amdgpu_test Starting BenchmarkLlama3_1_70B...
INFO:tests.models.llama.benchmark_amdgpu_test Testing Benchmark70B_f16_TP8_Non_Decomposed_Input_Len_128...
INFO:tests.models.llama.benchmark_amdgpu_test Testing Benchmark70B_f16_TP8_Non_Decomposed_Input_Len_2048...
INFO:tests.models.llama.benchmark_amdgpu_test Testing Benchmark70B_fp8_TP8_Decomposed...
INFO:tests.models.llama.benchmark_amdgpu_test Testing Benchmark70B_fp8_TP8_Non_Decomposed...
INFO:tests.models.llama.benchmark_amdgpu_test Starting BenchmarkLlama3_1_405B...
INFO:tests.models.llama.benchmark_amdgpu_test Testing Benchmark405B_f16_TP8_Non_Decomposed_Input_Len_128...
INFO:tests.models.llama.benchmark_amdgpu_test Testing Benchmark405B_f16_TP8_Non_Decomposed_Input_Len_2048...
INFO:tests.models.llama.benchmark_amdgpu_test Testing Benchmark405B_fp8_TP8_Decomposed...
INFO:tests.models.llama.benchmark_amdgpu_test Testing Benchmark405B_fp8_TP8_Non_Decomposed...

I also added these kinds of logs; I believe their positions are correct:

INFO:tests.models.llama.benchmark_amdgpu_test Compiling MLIR file...
INFO:tests.models.llama.benchmark_amdgpu_test IREE Benchmark Prefill...
INFO:tests.models.llama.benchmark_amdgpu_test IREE Benchmark Decode...
...

Signed-off-by: erman-gurses <[email protected]>
@erman-gurses force-pushed the users/erman-gurses/add-more-logging branch from 296bb2d to b086fdf on March 7, 2025 at 07:47
Signed-off-by: erman-gurses <[email protected]>
@erman-gurses force-pushed the users/erman-gurses/add-more-logging branch from d854f6a to a09d45b on March 7, 2025 at 16:24
@ScottTodd (Member) left a comment:

Sorry this took so long to review. The extra logs are generally good to add. I would at least remove the class-scoped logging though.

Comment on lines 110 to +111
class BenchmarkLlama3_1_8B(BaseBenchmarkTest):
logger.info("Testing BenchmarkLlama3_1_8B...")
@ScottTodd (Member) commented:

I would remove these class-scoped log lines and instead rely on the -v option already included on the pytest commands in the workflow files.


I don't think logs at the class scope are doing what you want, or that they are useful: https://github.com/nod-ai/shark-ai/actions/runs/13667985379/job/38212871488?pr=1017#step:7:22

snippet:

collecting ... 
----------------------------- live log collection ------------------------------
INFO:tests.models.llama.benchmark_amdgpu_test BenchmarkLlama3_1_8B...
INFO:tests.models.llama.benchmark_amdgpu_test Testing Benchmark8B_f16_Non_Decomposed_Input_Len_128...
INFO:tests.models.llama.benchmark_amdgpu_test Testing Benchmark405B_fp8_TP8_Non_Decomposed...
collected 11 items

Pytest has a few phases while running tests. This is the "collection" phase, where pytest discovers test cases from the directory tree based on any conftest.py files, settings/configuration files, and command line options. These log lines run during that collection phase, but you really want them to fire when the tests actually start. Only after collection completes are individual tests actually run, followed by a few other phases.
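The import-time behavior behind this can be shown with a minimal, hypothetical module (the class and test names below are illustrative, not from the PR): Python executes a class body the moment the module is imported, which is exactly what pytest does during collection, so a `logger.info` at class scope fires before any test runs.

```python
# Sketch of why class-scoped logging fires during collection: Python runs
# the class body at import time, which is when pytest imports test modules
# to collect them -- long before any test executes.
import logging

logger = logging.getLogger(__name__)
events = []  # records when each statement actually executes

class BenchmarkExample:
    # This runs once, while the module is being imported
    # (i.e. during pytest's collection phase).
    events.append("class body executed")
    logger.info("Testing BenchmarkExample...")  # fires during collection

    def test_something(self):
        # This only runs when pytest actually invokes the test.
        events.append("test body executed")

# At import time the class body has already run, but no test has:
assert events == ["class body executed"]
```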

I can't find docs on the phases, but https://docs.pytest.org/en/stable/reference/reference.html#hooks is close enough to get the point across:

  1. bootstrapping
  2. initialization
  3. collection
  4. test running
  5. reporting

However, pytest can already log when a test starts. See https://docs.pytest.org/en/stable/how-to/output.html, and the -v option in particular, which changes the output from

=========================== test session starts ============================
collected 4 items

test_verbosity_example.py .FFF                                       [100%]

================================= FAILURES =================================

to

=========================== test session starts ============================
collecting ... collected 4 items

test_verbosity_example.py::test_ok PASSED                            [ 25%]
test_verbosity_example.py::test_words_fail FAILED                    [ 50%]
test_verbosity_example.py::test_numbers_fail FAILED                  [ 75%]
test_verbosity_example.py::test_long_text_fail FAILED                [100%]

================================= FAILURES =================================
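If per-test start messages are still wanted beyond what -v prints, one option would be pytest's `setup_method` hook, which runs immediately before each test in a class. This is a sketch only; the class and test names are illustrative:

```python
# Hypothetical alternative to class-scoped logging: emit the "starting"
# message from setup_method, which pytest calls right before each test
# method, so the log line appears when the test actually begins.
import logging

logger = logging.getLogger("tests.models.llama.benchmark_amdgpu_test")

class BenchmarkLlama3_1_8B:
    def setup_method(self, method):
        # pytest invokes this immediately before each test in the class.
        logger.info("Starting %s...", method.__name__)

    def test_benchmark_8b_f16_prefill(self):
        pass  # benchmark body would go here
```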

Comment on lines +216 to 217
logger.info("Compiling MLIR file...")
self.llama8b_f16_torch_sdpa_artifacts.compile_to_vmfb(
@ScottTodd (Member) commented:
This helper already does some logging, but the extra logs don't hurt. What we could do in these places is provide context for what or why we're compiling/running. The "IREE Benchmark Prefill..." logs you added are a good example of that. Just seeing iree-run-module ... in the logs doesn't immediately provide such context, but "benchmark prefill" does.

INFO:tests.models.llama.benchmark_amdgpu_test Compiling MLIR file...

INFO:eval      Launching compile command:
cd /home/runner/_work/shark-ai/shark-ai && iree-compile /home/runner/_work/shark-ai/shark-ai/2025-03-05/llama-8b/f16_torch_128.mlir --iree-hip-target=gfx942 -o=/home/runner/_work/shark-ai/shark-ai/2025-03-05/llama-8b/f16_torch_128.vmfb --iree-hal-target-device=hip --iree-hal-dump-executable-files-to=/home/runner/_work/shark-ai/shark-ai/2025-03-05/llama-8b/f16_torch_128/files --iree-dispatch-creation-enable-aggressive-fusion=true --iree-global-opt-propagate-transposes=true --iree-opt-aggressively-propagate-transposes=true --iree-opt-data-tiling=false --iree-preprocessing-pass-pipeline='builtin.module(util.func(iree-preprocessing-generalize-linalg-matmul-experimental))' --iree-stream-resource-memory-model=discrete --iree-hal-indirect-command-buffers=true --iree-hal-memoization=true --iree-opt-strip-assertions

INFO:eval      compile_to_vmfb: 16.77 secs
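The "context first" suggestion could look like the following sketch. The helper name and paths are hypothetical; the iree-compile flags are copied from the log above.

```python
# Hypothetical helper illustrating purpose-level logging: say *why* a
# command runs before launching it, so CI logs read as a narrative
# rather than a bare command line.
import logging

logger = logging.getLogger(__name__)

def build_compile_invocation(mlir_path, vmfb_path, hip_target="gfx942"):
    # Context first, per the review suggestion.
    logger.info("Compiling MLIR file %s for the prefill benchmark...", mlir_path)
    # Flags mirror the iree-compile invocation shown in the log above.
    return [
        "iree-compile", mlir_path,
        f"--iree-hip-target={hip_target}",
        "--iree-hal-target-device=hip",
        "-o", vmfb_path,
    ]

cmd = build_compile_invocation("f16_torch_128.mlir", "f16_torch_128.vmfb")
```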

Comment on lines 602 to +603
# benchmark prefill
logger.info("IREE Benchmark Prefill...")
@ScottTodd (Member) commented:
nit: when comments and logs say the same thing, you can remove the comments.

Labels
infra General category for infrastructure-related requests for common triaging and prioritization