[CICD] flagos user tests management #4

Merged
Darryl233 merged 15 commits into flagos-ai:main from Darryl233:cicd
Mar 20, 2026

Darryl233 (Collaborator) commented Mar 18, 2026

Summary

  • Add a flagos-user-tests subdirectory with a standardized user test framework supporting multi-repo test case management (flagscale, flagcx, flaggems, etc.)
  • Provide a complete test toolchain: test case collection (collect_test_cases.py), matrix resolution (resolve_matrix.py), test execution (run_user_tests.py), and Conda environment initialization (activate_conda.sh)
  • Add validator tools (validators/): config validation, gold values validation, and lint checks to enforce test case format conventions
  • Provide a test case template generator (generators/create_test_template.py) to simplify authoring new test cases
  • Add CI workflow post_test_cases.yml that automatically collects test case metadata and uploads it to the backend via the post-benchmark-report action after PRs are merged to main, enabling continuous tracking of the test case inventory
  • Include flagscale/inference/qwen3/demo_0_6b as a reference test case implementation
  • Add comprehensive documentation: README, CONTRIBUTING guide, test format spec (test_format_spec.md), and getting started guide (getting_started.md)
  • Fix report JSON output format in collect_test_cases.py to use object-of-objects structure required by post-benchmark-report action (was incorrectly outputting a JSON array)
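
The report format fix in the last item can be illustrated with a minimal sketch. It assumes each collected test case is a flat record and that the report should be keyed by a unique case identifier. The field names (`case_id`, `repo`, `task`) and the helper `to_object_of_objects` are hypothetical and chosen for illustration only; the actual schema expected by the post-benchmark-report action is not shown in this PR description.

```python
import json


def to_object_of_objects(cases, key_field="case_id"):
    """Convert a list of test case records into a dict keyed by key_field.

    key_field is a hypothetical identifier field, not the confirmed schema.
    """
    report = {}
    for case in cases:
        key = case[key_field]
        if key in report:
            # Keys must be unique once the list becomes an object.
            raise ValueError(f"duplicate test case key: {key}")
        # Store the remaining fields as the nested object for this key.
        report[key] = {k: v for k, v in case.items() if k != key_field}
    return report


# Hypothetical record for the reference test case named in this PR.
cases = [
    {
        "case_id": "flagscale/inference/qwen3/demo_0_6b",
        "repo": "flagscale",
        "task": "inference",
    },
]

report = to_object_of_objects(cases)
print(json.dumps(report, indent=2))
```

Keying by identifier means duplicate entries are caught at collection time instead of being silently appended, which is one plausible reason a consumer would prefer an object over an array.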

Darryl233 changed the title from "[wip] flagos user tests" to "[CICD] flagos user tests management" on Mar 18, 2026
xmhubj self-requested a review on March 19, 2026 12:00
Darryl233 merged commit 7294dcd into flagos-ai:main on Mar 20, 2026
3 checks passed
Darryl233 deleted the cicd branch on March 20, 2026 03:35

2 participants