@c-ehrlich c-ehrlich commented Oct 24, 2025

This PR improves the Scorer API, making it easier to use and more type-safe, and bringing it in line with the other APIs in the SDK.

Cleaner Scorer Definition with Type Inference

Before:

const scorer = ({ output, expected }: { output: string; expected?: string }) => {
  if (!expected) {
    throw new Error('No expected value provided');
  }
  return {
    name: 'My Scorer',
    score: output === expected ? 1 : 0,
  };
};

After:

const scorer = Scorer(
  'My Scorer',
  ({ output, expected }: { output: string; expected: string }) => 
    output === expected ? 1 : 0
);

Better types

[Screenshot: CleanShot 2025-10-29 at 17 49 47@2x]

Name Property on Scorer Functions

Scorer names are now attached to the function itself as a property:

const myScorer = Scorer('Exact Match', ({ output, expected }) => ...);
console.log(myScorer.name); // 'Exact Match'

This means users don't need to worry about returning it in the scorer.
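
As a rough illustration of how a factory can attach the name to the function itself, here is a minimal sketch. It is not the SDK's implementation: makeScorer, NamedScorer, and the Score shape are hypothetical stand-ins chosen for this example.

```typescript
// Hypothetical stand-in for the SDK's factory; names and shapes are assumptions.
type Score = { score: number };

type NamedScorer<TArgs> = ((args: TArgs) => Score) & { readonly name: string };

function makeScorer<TArgs>(
  name: string,
  fn: (args: TArgs) => number | Score,
): NamedScorer<TArgs> {
  const wrapped = (args: TArgs): Score => {
    const result = fn(args);
    // Normalize a bare number into a score object.
    return typeof result === 'number' ? { score: result } : result;
  };
  // A function's own `name` is non-writable but configurable, so a plain
  // assignment would not take effect; defineProperty overrides it.
  Object.defineProperty(wrapped, 'name', { value: name, configurable: true });
  return wrapped as NamedScorer<TArgs>;
}

const exactMatch = makeScorer(
  'Exact Match',
  ({ output, expected }: { output: string; expected: string }) =>
    output === expected ? 1 : 0,
);

console.log(exactMatch.name);
console.log(exactMatch({ output: 'a', expected: 'a' }));
```

Because the name lives on the function, the callback only has to return the score.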

The name is also included in scorer span names now, i.e. they will look something like: scorer exact-match
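
The PR does not spell out how the span name is derived from the scorer name. One plausible mapping, shown purely as an assumption, is to slugify the name and prefix it; scorerSpanName is a hypothetical helper, not an SDK function.

```typescript
// Hypothetical helper: the exact span-name format is not specified in this PR
// beyond "something like scorer exact-match".
function scorerSpanName(scorerName: string): string {
  const slug = scorerName
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-') // collapse runs of other characters into one hyphen
    .replace(/^-+|-+$/g, ''); // drop leading and trailing hyphens
  return `scorer ${slug}`;
}

console.log(scorerSpanName('Exact Match'));
```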

Support for Custom Extra Parameters

Scorers can now accept additional custom parameters beyond input, expected, and output:

const customScorer = Scorer(
  'Custom',
  ({ output, customProp }: { output: string; customProp: boolean }) => {
    return customProp ? 1 : 0;
  }
);

These type parameters are automatically inferred.
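
To show the kind of inference this relies on, the sketch below uses a hypothetical defineScorer (not the SDK's Scorer) whose single type parameter is inferred from the callback's annotated argument. The real signature may differ.

```typescript
// Hypothetical stand-in; the SDK's real signature may differ.
function defineScorer<TArgs extends Record<string, unknown>>(
  name: string,
  fn: (args: TArgs) => number,
): { name: string; run: (args: TArgs) => number } {
  return { name, run: fn };
}

const customScorer = defineScorer(
  'Custom',
  ({ output, customProp }: { output: string; customProp: boolean }) =>
    customProp && output.length > 0 ? 1 : 0,
);

// The argument type is recovered from the callback, with no explicit generics.
type CustomArgs = Parameters<typeof customScorer.run>[0];

const ok: CustomArgs = { output: 'hello', customProp: true };
// const bad: CustomArgs = { output: 'hello' }; // compile error: customProp is missing

const result = customScorer.run(ok);
console.log(result);
```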

Better interop with autoevals scorers

Users can now wrap autoevals scorers in Scorer and map the arguments explicitly:

import { ExactMatch } from 'autoevals';

const WrappedExactMatch = Scorer(
  'Exact match',
  (args: {
    output: { response: string; category: string };
    expected: { response: string; category: string };
  }) => {
    return ExactMatch({
      output: args.output.response,
      expected: args.expected.response,
    });
  },
);

Unfortunately we can't make autoevals scorers plug and play, because their input/output args are all over the place.

Examples

The example-evals-nextjs scorers have been updated.

Internal improvements

  • Separated Score (without name) from ScoreWithName (internal use)
  • Improved type inference in createScorer factory
  • Added comprehensive type tests (scorer.types.test.ts)
  • CI: Moved format check before build
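
For readers unfamiliar with the Score / ScoreWithName split and with type-level tests, here is a small sketch. The type shapes, withName, and the Equal/Expect helpers are assumptions for illustration; the PR only states that Score carries no name and that ScoreWithName is internal.

```typescript
// Assumed shapes; the actual SDK types may carry more fields.
type Score = { score: number };
type ScoreWithName = Score & { name: string };

// Internal-style helper that pairs a user-returned score with the scorer's name.
function withName(name: string, score: Score): ScoreWithName {
  return { ...score, name };
}

// Compile-time equality check, a common pattern in type tests.
type Equal<A, B> =
  (<T>() => T extends A ? 1 : 2) extends (<T>() => T extends B ? 1 : 2) ? true : false;
type Expect<T extends true> = T;

// These aliases stop compiling if the types drift.
type NameIsString = Expect<Equal<ScoreWithName['name'], string>>;
type ScoreHasNoName = Expect<Equal<'name' extends keyof Score ? true : false, false>>;

const named = withName('Exact Match', { score: 1 });
console.log(named);
```

The checks run entirely at compile time, so a type regression fails the build rather than a runtime test.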

pkg-pr-new bot commented Oct 24, 2025

npm i https://pkg.pr.new/axiomhq/ai/axiom@106

commit: a84bffa

@c-ehrlich c-ehrlich marked this pull request as ready for review October 28, 2025 13:21
Copilot AI review requested due to automatic review settings October 28, 2025 13:21

Copilot AI left a comment

Pull Request Overview

This PR refactors the scorer system to improve type inference and ergonomics. The key changes separate scorer definitions from usage requirements, introduce automatic type inference from function arguments, and simplify the API by removing the need for explicit type parameters in most cases.

  • Introduces ScorerLike for flexible scorer consumption and Scorer for strict scorer definitions with a name property
  • Implements automatic type inference in createScorer based on the callback's argument types
  • Updates scorer factory to attach the name as a property instead of embedding it in the returned score
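
On the consuming side, the distinction described above might look roughly like the following. This is a sketch under assumed type shapes: ScorerLike here is a local definition, and runScorers and the 'unknown' fallback are hypothetical rather than taken from eval.ts.

```typescript
// Assumed shapes for illustration; the SDK's ScorerLike/Scorer types may differ.
type Score = { score: number };
type ScorerLike<TArgs> = ((args: TArgs) => number | Score) & { readonly name?: string };

type NamedResult = Score & { name: string };

function runScorers<TArgs>(scorers: ScorerLike<TArgs>[], args: TArgs): NamedResult[] {
  return scorers.map((scorer) => {
    const raw = scorer(args);
    const score = typeof raw === 'number' ? { score: raw } : raw;
    // Read the name from the function property; anonymous functions report ''.
    const name = scorer.name || 'unknown';
    return { ...score, name };
  });
}

function lengthCheck({ output }: { output: string; expected: string }): number {
  return output.length > 0 ? 1 : 0;
}

const results = runScorers(
  [lengthCheck, ({ output, expected }) => (output === expected ? 1 : 0)],
  { output: 'a', expected: 'b' },
);
console.log(results);
```

Accepting the looser type on input lets plain functions and factory-built scorers be mixed in one list.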

Reviewed Changes

Copilot reviewed 11 out of 12 changed files in this pull request and generated 4 comments.

Summary per file (File: Description)

  • packages/ai/src/evals/scorers.ts: Separates Score and ScoreWithName types, introduces ScorerLike vs Scorer distinction with TExtra support
  • packages/ai/src/evals/scorer.factory.ts: Refactors createScorer to infer types from arguments and attach name as function property
  • packages/ai/test/evals/scorer.types.test.ts: Adds comprehensive type inference tests for the new scorer system
  • packages/ai/src/evals/eval.types.ts: Updates type references and strengthens OutputOf type constraints
  • packages/ai/src/evals/eval.ts: Updates to use ScorerLike and extract scorer name from function property
  • packages/ai/src/evals/builder.ts: Adds TODO comment about unused function
  • packages/ai/README.md: Documents Node version requirement for evals
  • examples/example-evals-nextjs/test/feature.eval.ts: Migrates to new Scorer factory pattern
  • examples/example-evals-nextjs/src/lib/scorers.ts: Updates scorers to use typed arguments and removes optional expected handling
  • examples/example-evals-nextjs/src/lib/capabilities/classify-ticket/evaluations/ticket-classification.eval.ts: Adds wrapped autoevals scorer example
  • .github/workflows/ci.yaml: Reorders CI steps to run format check before build

Files not reviewed (1)
  • pnpm-lock.yaml: Language not supported

@lukasmalkmus lukasmalkmus left a comment

Breaking change so we should probably merge it after your other PRs, potentially even have a release in between?

@c-ehrlich c-ehrlich merged commit cb95b46 into main Nov 3, 2025
6 checks passed
@c-ehrlich c-ehrlich deleted the better-scorers branch November 3, 2025 08:30