
test: switch LLM API tests to qwen3.7-max#991

Merged
cmgzn merged 6 commits into main from chore/update-llm-test-model-qwen36-plus
Jun 17, 2026

Conversation

@cmgzn cmgzn commented Jun 10, 2026

Collaborator

Updates LLM API-based tests to use qwen3.7-max instead of qwen2.5-72b-instruct.

This avoids failures caused by restricted or unavailable access to the previous model in CI.

Validation:

  • Ran a lightweight Python compile check on the affected test directories
  • Verified no lint errors
  • Did not run full test suite locally; full validation should run in official CI
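
A minimal sketch of what such a compile check could look like, assuming the standard-library `compileall` module. The PR does not say which command was actually used, and the function name and throwaway directories below are illustrative only.

```python
import compileall
import tempfile
from pathlib import Path


def failed_compile_dirs(directories):
    """Byte-compile every .py file under each directory; return the directories with errors."""
    failed = []
    for directory in directories:
        # compile_dir returns a true value only if every file compiled cleanly.
        ok = compileall.compile_dir(str(directory), quiet=2, force=True)
        if not ok:
            failed.append(str(directory))
    return failed


if __name__ == "__main__":
    # Throwaway directories stand in for the affected test directories.
    with tempfile.TemporaryDirectory() as tmp:
        good = Path(tmp) / "good"
        bad = Path(tmp) / "bad"
        good.mkdir()
        bad.mkdir()
        (good / "test_ok.py").write_text("x = 1\n")
        (bad / "test_broken.py").write_text("def broken(:\n")
        print(failed_compile_dirs([good, bad]))
```

A check like this only catches syntax errors. It does not call the model API, so it cannot show whether the tests pass with the new model.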

@cmgzn cmgzn marked this pull request as ready for review June 10, 2026 05:59

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request updates the API model and tokenizer name from 'qwen2.5-72b-instruct' to 'qwen3.6-plus' across multiple test files. However, 'qwen3.6-plus' appears to be an invalid or non-existent model name, which will cause API calls and tokenizer loading to fail during test execution. It is recommended to correct this to a valid model name, such as 'qwen-plus'.

Comment thread tests/ops/aggregator/test_entity_attribute_aggregator.py Outdated
Comment thread tests/ops/mapper/test_text_chunk_mapper.py Outdated

@fengrui-z fengrui-z left a comment

Collaborator

Code uses qwen3.7-max everywhere, but PR title says qwen3.6-plus. Fix the title.

sampling_params is misaligned in several dialog_* test files — not matching the other keyword args.

Unrelated changes:

  • uv.lock swaps bs4 → beautifulsoup4, which is unrelated to the model migration.
  • test_llm_analysis_filter.py rewrites RFT test data and adds min_score=0.7 — behavioral change beyond a model swap.

PR description says tests weren't run locally. qwen3.7-max may produce different output formats; recommend at least running the affected tests before merge.

@cmgzn cmgzn changed the title from "test: switch LLM API tests to qwen3.6-plus" to "test: switch LLM API tests to qwen3.7-max" Jun 16, 2026

cmgzn commented Jun 16, 2026

Collaborator Author

Thanks for the thorough review! Here are my responses:

  1. PR title says qwen3.6-plus but code uses qwen3.7-max: With the original choice, qwen3.6-plus, 2–3 tests failed repeatedly, so I switched to qwen3.7-max. I'll update the PR title and description once all tests pass.

  2. uv.lock swaps bs4 → beautifulsoup4: This is a leftover from PR #977 ("Replace bs4 stub with beautifulsoup4 in dependencies"), which swapped "bs4" for "beautifulsoup4" in the pyproject.toml dependencies but didn't sync uv.lock. A separate PR for this didn't seem worthwhile, so I've included the fix here.

  3. test_llm_analysis_filter.py RFT test data rewrite & min_score=0.7: After the model swap, the RFT tests became flaky — LLM scoring tests are inherently unstable. I adjusted the test data to widen the quality gap between samples and raised min_score to improve reliability.

  4. sampling_params misalignment in dialog_* test files: Fixed! All sampling_params keyword args are now aligned with the other keyword arguments in test_dialog_topic_detection_mapper.py, test_dialog_sentiment_detection_mapper.py, test_dialog_intent_detection_mapper.py, and test_dialog_sentiment_intensity_mapper.py.
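
For point 3, a small sketch of why a wider score gap makes a threshold check less flaky. The scores, the ±0.1 noise band, and the helper names are hypothetical and not taken from test_llm_analysis_filter.py; only min_score=0.7 comes from the discussion above.

```python
def keep(score, min_score=0.7):
    """Threshold filter: keep a sample when its score reaches min_score."""
    return score >= min_score


def is_stable(score, min_score=0.7, noise=0.1):
    """True if the keep/drop decision cannot flip when the score moves by up to +/- noise."""
    return keep(score - noise, min_score) == keep(score + noise, min_score)


# Hypothetical scores, not taken from the PR's test data.
narrow_gap = {"good": 0.75, "bad": 0.65}
wide_gap = {"good": 0.95, "bad": 0.20}

print({name: is_stable(score) for name, score in narrow_gap.items()})
print({name: is_stable(score) for name, score in wide_gap.items()})
```

With the narrow gap, both samples sit within the noise band of the threshold, so either decision can flip between runs. With the wide gap, neither can, which is the effect the adjusted test data aims for.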

@cmgzn cmgzn deployed to Testing June 16, 2026 03:20 — with GitHub Actions Active

@fengrui-z fengrui-z left a comment

Collaborator

LGTM

@cmgzn cmgzn merged commit e622254 into main Jun 17, 2026
7 of 8 checks passed

2 participants