Add integration tests for skill templates#15131

Open
rishic3 wants to merge 9 commits into NVIDIA:main from rishic3:skills-tests

Conversation

rishic3 (Collaborator) commented Jun 23, 2026

Contributes to #15013.

Description

Adds integration tests for skill templates. These were originally included in #15058 but were deferred to this separate PR. The tests verify that the templates work out of the box: builds succeed, scripts run, and validations catch the errors they are supposed to catch.

The tests create a dummy project, deterministically fill in all of the TODO methods (using the method stubs under fixtures) and add UDF implementations (using the complete source files under resources), and then run the sequence of scripts that the agent would go through.
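The setup described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `materialize_project`, the placeholder strings, and the directory layout are all hypothetical, and it assumes every file in the template is UTF-8 text.

```python
import shutil
import tempfile
from pathlib import Path


def materialize_project(template_dir: Path, stubs: dict[str, str]) -> Path:
    """Copy a template tree into a fresh temp dir and fill in placeholders.

    `stubs` maps a placeholder string to its replacement text. Both the
    placeholder format and this helper are hypothetical.
    """
    project = Path(tempfile.mkdtemp(prefix="skill-template-")) / "project"
    shutil.copytree(template_dir, project)
    for path in project.rglob("*"):
        if not path.is_file():
            continue
        # Assumption: the template contains only UTF-8 text files.
        original = path.read_text(encoding="utf-8")
        filled = original
        for placeholder, body in stubs.items():
            filled = filled.replace(placeholder, body)
        if filled != original:
            path.write_text(filled, encoding="utf-8")
    return project
```

Because the replacements are plain string substitutions over a copy, the result is deterministic and the template itself is never modified.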

Checklists

Documentation

  • Updated for new or modified user-facing features or behaviors
  • No user-facing change

Testing

  • Added or modified tests to cover new code paths
  • Covered by existing tests
    (Please provide the names of the existing tests in the PR description.)
  • Not required

Performance

  • Tests ran and results are added in the PR description
  • Issue filed with a link in the PR description
  • Not required

greptile-apps Bot (Contributor) commented Jun 23, 2026

Greptile Summary

Adds a new skills/tests/ package with integration tests that verify the JVM skill templates work end-to-end: each test parameterizes over (language, target) combinations, copies the template into a temp directory, deterministically fills ??? stubs with fixture implementations, and then drives Maven/shell scripts through the full compile → test → benchmark pipeline.

  • test_jvm_templates.py covers 6 parameterized build/run scenarios plus two deliberate-failure scenarios (broken GPU output caught by comparison tests, broken schema caught by GenData --validate).
  • test_frontmatter.py adds a lightweight parametrized check that every SKILL.md frontmatter is valid YAML containing name and description.
  • pyproject.toml and TESTING.md wire up the test environment and document the slow marker required to invoke the GPU-dependent tests.
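For illustration, a frontmatter check along the lines of the second bullet might look like the sketch below. It is an assumption-laden simplification: it uses only the standard library and accepts flat `key: value` lines, whereas the PR is described as validating real YAML (which would need a YAML parser). The names `read_frontmatter` and `check_skill` are hypothetical.

```python
import re

# Frontmatter is the block between a leading '---' line and the next '---' line.
FRONTMATTER_RE = re.compile(r"\A---\s*\n(.*?)\n---\s*(?:\n|\Z)", re.DOTALL)


def read_frontmatter(markdown: str) -> dict[str, str]:
    """Parse flat `key: value` frontmatter (a stand-in for a real YAML parser)."""
    match = FRONTMATTER_RE.match(markdown)
    if match is None:
        raise ValueError("missing frontmatter block")
    fields: dict[str, str] = {}
    for line in match.group(1).splitlines():
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"not a 'key: value' line: {line!r}")
        fields[key.strip()] = value.strip().strip("'\"")
    return fields


def check_skill(markdown: str) -> None:
    """Require non-empty string `name` and `description` fields."""
    fields = read_frontmatter(markdown)
    for required in ("name", "description"):
        value = fields.get(required)
        assert isinstance(value, str) and value, f"missing field: {required}"
```

In a pytest suite, `check_skill` would typically be called once per discovered SKILL.md file via parametrization.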

Confidence Score: 5/5

This PR adds only new test infrastructure and resource files; it has no impact on any production code path.

All changes are test scaffolding. The fixture lifecycle is correct (try/finally ensures temp-dir cleanup even when setup raises before yield), the regex stub-replacement logic is sound, and the CUDA/JNI resource files follow cuDF memory conventions properly. No production code is touched.
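The cleanup property mentioned here (the temp dir is removed even when setup raises before the `yield`) can be shown with a plain generator in the shape of a pytest yield-fixture. This is a sketch, not the PR's fixture: `project_fixture` and its `setup` callback are hypothetical, and the `@pytest.fixture` decorator is omitted so the snippet runs without pytest.

```python
import shutil
import tempfile
from pathlib import Path


def project_fixture(setup):
    """Generator shaped like a pytest yield-fixture (decorator omitted)."""
    tmpdir = Path(tempfile.mkdtemp(prefix="skill-test-"))
    try:
        setup(tmpdir)  # may raise before the yield is ever reached
        yield tmpdir   # the test body runs while the generator is suspended here
    finally:
        # Runs on normal teardown, on test failure, and on setup failure.
        shutil.rmtree(tmpdir, ignore_errors=True)
```

The key point is that `mkdtemp` sits outside the `try` while everything that can fail sits inside it, so a failed setup cannot leak the directory.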

No files require special attention. The one note is a cosmetic dead branch in project_with_broken_gpu inside test_jvm_templates.py.

Important Files Changed

| Filename | Overview |
| --- | --- |
| skills/tests/test_jvm_templates.py | Main integration test file: fixtures create parameterized temp project copies, fill stubs, and test compile/run/error-detection scenarios across (lang × target) combinations. One cosmetic dead branch noted in project_with_broken_gpu. |
| skills/tests/fixtures.py | Source fixtures (Scala/Java/SQL/C++ snippets) used to fill template stubs deterministically; register-call helpers are correctly split between Scala (no explicit return type) and Java (explicit IntegerType). |
| skills/tests/utils.py | Shared helpers: run_mvn (sets cwd, -q), run_script (no cwd; script self-navigates per previous resolution), and replace_scala_todo_method (DOTALL regex, anchored to method name + ???). |
| skills/tests/test_frontmatter.py | Lightweight parametrized YAML frontmatter validator for all SKILL.md files; checks name and description fields are present and are strings. |
| skills/pyproject.toml | New pyproject.toml introducing the aether-agent package; pins dev dependencies and registers the slow pytest marker used by integration tests. |
| skills/tests/resources/IntegerMultiplyBy2Jni.cpp | JNI bridge for the CUDA UDF test resource; validates input type and correctly releases the raw column pointer via result.release(). |
| skills/tests/resources/integer_multiply_by_2.cu | CUDA kernel resource: copies null mask, allocates result column, launches multiply-by-2 kernel with CUDF_CHECK_CUDA, and correctly threads stream/MR through the cuDF APIs. |
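The stub-replacement idea noted for utils.py (a DOTALL regex anchored to the method name plus `???`) could look something like the sketch below. The helper name `replace_todo_body`, its signature, and the exact pattern are assumptions made for illustration; the PR's `replace_scala_todo_method` may differ.

```python
import re


def replace_todo_body(source: str, method: str, body: str) -> str:
    """Replace the `???` body of one Scala method with `body`.

    Hypothetical helper: matches `def <method> ... = ???`, allowing the
    signature and the `???` to sit on different lines (hence DOTALL).
    """
    pattern = re.compile(
        r"(def\s+" + re.escape(method) + r"\b.*?=\s*)\?\?\?",
        re.DOTALL,
    )
    # A function replacement avoids backslash/group escapes inside `body`.
    new_source, count = pattern.subn(lambda m: m.group(1) + body, source, count=1)
    if count != 1:
        raise ValueError(f"no ??? stub found for method {method!r}")
    return new_source
```

One caveat of this particular pattern: if the named method exists but has already been implemented, the lazy `.*?` can run on into a later method's `???`, so a stricter version would bound the match to a single definition.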

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[pytest collects test_jvm_templates.py] --> B{Fixture type}

    B -->|project_dir\nscope=module| C[_build_project_dir\ncopy template → tmpdir]
    B -->|project_with_fixtures\nscope=module\nparams=LANG×TARGET| D[_build_project_dir\n+ _fill_stubs]
    B -->|project_with_broken_gpu\nscope=class\nparams=TARGETS| E[_build_project_dir\n+ _fill_stubs\n+ break GPU source]
    B -->|project_with_broken_schema\nscope=class| F[_build_project_dir\n+ _fill_stubs cudf\n+ corrupt BenchUtils]

    C --> G[TestCompilation\ntest_compile_smoke\nmvn clean compile]
    D --> H[TestCompilation\ntest_compile_with_fixtures\nmvn clean test-compile]
    D --> I[TestComparisonTest\ntest_run_comparison_test\nmvn test -Dsuites=...]
    D --> J[TestBench\ntest_validate\nmvn exec:java GenData --validate]
    D --> K[TestBench\ntest_spark_e2e\nrun_gen_data.sh + run_spark_benchmark.sh]
    D --> L[TestBench\ntest_micro_e2e\nrun_gen_data.sh + run_micro_benchmark.sh\nskip if sql]
    E --> M[TestErrors\ntest_comparison_catches_gpu_error\nassert returncode≠0, 246+369 in output]
    F --> N[TestErrors\ntest_bench_validate_catches_error\nassert returncode≠0, AnalysisException in stderr]
    G & H & I & J & K & L & M & N --> Z[Fixture teardown\nshutil.rmtree tmpdir]

Reviews (5): Last reviewed commit: "updates for recent method changes"

Comment thread skills/tests/utils.py
Comment thread skills/tests/utils.py

Copilot AI (Contributor) left a comment

Pull request overview

Adds a new skills/tests/ pytest suite that integration-tests the JVM skill templates end-to-end by materializing a template project, filling in TODO stubs with deterministic fixtures (including CPU/RapidsUDF/SQL/CUDA-native variants), and then running mvn builds plus the provided benchmark scripts to ensure the templates work “out of the box”.

Changes:

  • Add integration tests that generate a temporary project from the JVM templates, fill stubs, build, and execute template test/benchmark flows for (scala|java) x (cudf|sql|cuda).
  • Add fixtures and resource source files used to populate template TODOs and native CUDA/JNI placeholders.
  • Add skill frontmatter parsing tests, plus Python dev/test configuration and a testing doc.
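The (scala|java) x (cudf|sql|cuda) matrix in the first bullet amounts to a Cartesian product of two small lists. A minimal sketch, assuming constant names `LANGS` and `TARGETS` and an ID format that are not confirmed by the PR:

```python
from itertools import product

# Assumed constant names and values, taken from the matrix described above.
LANGS = ("scala", "java")
TARGETS = ("cudf", "sql", "cuda")

# One (language, target) pair per generated project.
PARAMS = list(product(LANGS, TARGETS))
IDS = [f"{lang}-{target}" for lang, target in PARAMS]


def runs_micro_benchmark(target: str) -> bool:
    """The micro benchmark is described as skipped for the sql target."""
    return target != "sql"
```

With pytest, lists like these would usually be passed as `params=PARAMS, ids=IDS` to a fixture so that each combination shows up as its own test ID.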

Reviewed changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| skills/tests/utils.py | Test helpers for running Maven/scripts and patching Scala TODO stubs. |
| skills/tests/test_jvm_templates.py | Main integration test suite that builds/runs the materialized template project across targets. |
| skills/tests/test_frontmatter.py | Lightweight test to ensure all SKILL.md YAML frontmatter parses and has key fields. |
| skills/tests/fixtures.py | Loads resource files and provides stub method implementations used to fill template TODOs. |
| skills/tests/__init__.py | Marks skills/tests as a Python package for relative imports. |
| skills/tests/resources/IntegerMultiplyBy2UDF.scala | Scala CPU UDF reference implementation used by the tests. |
| skills/tests/resources/IntegerMultiplyBy2UDF.java | Java CPU UDF reference implementation used by the tests. |
| skills/tests/resources/IntegerMultiplyBy2RapidsUDF.scala | Scala RapidsUDF implementation used by the tests. |
| skills/tests/resources/IntegerMultiplyBy2RapidsUDF.java | Java RapidsUDF implementation used by the tests. |
| skills/tests/resources/IntegerMultiplyBy2NativeRapidsUDF.java | Java wrapper for the native CUDA UDF used by the tests. |
| skills/tests/resources/IntegerMultiplyBy2Jni.cpp | JNI bridge source used by the native CUDA target tests. |
| skills/tests/resources/integer_multiply_by_2.sql | SQL implementation used by the SQL target tests. |
| skills/tests/resources/integer_multiply_by_2.hpp | Native header used by the CUDA target tests. |
| skills/tests/resources/integer_multiply_by_2.cu | Native CUDA implementation used by the CUDA target tests. |
| skills/pyproject.toml | Python dev dependencies + pytest marker registration for skills/ tests. |
| skills/docs/dev/TESTING.md | Developer docs for running fast vs integration tests under skills/. |
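As a rough picture of what "test helpers for running Maven/scripts" can involve, the sketch below runs a command inside the project directory and captures its output so a test can assert on the return code and on stdout/stderr. The names `mvn_command` and `run_in_project`, the timeout, and the argument layout are assumptions, not the PR's `run_mvn` or `run_script`.

```python
import subprocess
import sys
from pathlib import Path


def mvn_command(*goals: str) -> list[str]:
    """Build a quiet Maven invocation, e.g. mvn -q clean compile."""
    return ["mvn", "-q", *goals]


def run_in_project(
    cmd: list[str], project_dir: Path, timeout: float = 600.0
) -> subprocess.CompletedProcess:
    """Run `cmd` with the project as cwd; never raise on a non-zero exit."""
    return subprocess.run(
        cmd,
        cwd=project_dir,
        capture_output=True,  # keep stdout/stderr for assertions
        text=True,
        timeout=timeout,
        check=False,          # failure scenarios inspect returncode themselves
    )
```

Leaving `check=False` matters for the deliberate-failure scenarios, where the test expects a non-zero return code and then looks for a specific message in the captured output. For example, `run_in_project([sys.executable, "-c", "print('ok')"], Path("."))` returns a result whose `returncode` is 0.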

Comment thread skills/docs/dev/TESTING.md Outdated
Comment thread skills/docs/dev/TESTING.md Outdated
Comment thread skills/tests/fixtures.py Outdated
Comment thread skills/tests/test_jvm_templates.py
Comment thread skills/tests/test_jvm_templates.py
rishic3 and others added 3 commits June 24, 2026 09:04
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
3 participants