Fix metrics assertions for Tinker-API based E2E CI runs #1664

@SumanthRH

Summary

We introduced E2E CI for the Tinker API with the SkyRLTrainBackend in #1616, but it currently has no meaningful assertions on the metric values; the reward threshold is still a placeholder:

# TODO: tighten thresholds after 3-5 nightly runs (5% allowance from min observed),
# matching the convention in gsm8k_colocate.sh.
REWARD_MIN_VALUE=0.0

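We need to add assertions on the expected metric values, based on the results of 5 nightly runs and following the convention in the TODO above (a 5% allowance below the minimum observed value).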
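As a rough illustration of the thresholding convention described in the TODO, the sketch below derives a minimum value from a set of observed rewards and checks a new run against it. The function names `compute_min_threshold` and `assert_metric_at_least` are hypothetical and not part of the existing CI scripts, and the reward values in the usage comments are placeholders, not real nightly results.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; names and values are illustrative assumptions only.

# Prints min(observed values) * (1 - allowance), to 4 decimal places.
# Usage: compute_min_threshold <allowance> <value> [<value> ...]
compute_min_threshold() {
  local allowance="$1"; shift
  printf '%s\n' "$@" | awk -v a="$allowance" '
    NR == 1 || $1 < min { min = $1 }
    END { printf "%.4f\n", min * (1 - a) }'
}

# Returns 0 if value >= threshold, otherwise prints a failure and returns 1.
# Usage: assert_metric_at_least <name> <value> <threshold>
assert_metric_at_least() {
  local name="$1" value="$2" threshold="$3"
  if awk -v v="$value" -v t="$threshold" 'BEGIN { exit !(v >= t) }'; then
    echo "OK: $name=$value >= $threshold"
  else
    echo "FAIL: $name=$value < $threshold" >&2
    return 1
  fi
}

# Example with placeholder rewards from five imagined nightly runs:
#   REWARD_MIN_VALUE="$(compute_min_threshold 0.05 0.40 0.50 0.60 0.45 0.55)"
#   assert_metric_at_least reward "$observed_reward" "$REWARD_MIN_VALUE"
```

Deriving the threshold from recorded values rather than hard-coding it keeps the 5% allowance explicit, though the actual script may prefer a fixed constant once the nightly numbers are known.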
