feat: better scorers #106

c-ehrlich · 2025-10-24T00:58:26Z

This PR improves the Scorer API: it is easier to use, more type-safe, and more consistent with the other APIs in the SDK.

Cleaner Scorer Definition with Type Inference

Before:

const scorer = ({ output, expected }: { output: string; expected?: string }) => {
  if (!expected) {
    throw new Error('No expected value provided');
  }
  return {
    name: 'My Scorer',
    score: output === expected ? 1 : 0,
  };
};

After:

const scorer = Scorer(
  'My Scorer',
  ({ output, expected }: { output: string; expected: string }) => 
    output === expected ? 1 : 0
);

Better types

Name Property on Scorer Functions

Scorer names are now attached to the function itself as a property:

const myScorer = Scorer('Exact Match', ({ output, expected }) => ...);
console.log(myScorer.name); // 'Exact Match'

This means users don't need to worry about returning it in the scorer.

The name is now also included in scorer span names, i.e. a span will be named something like `scorer exact-match`.
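As a rough illustration of how a name can end up on the function itself: a function's built-in `name` property is read-only but configurable, so plain assignment does not work and `Object.defineProperty` is needed. `makeNamedScorer` and `spanNameFor` below are hypothetical helpers, not the SDK's implementation, and the slug format is only a guess based on the `scorer exact-match` example above.

```typescript
// Hypothetical sketch, not the SDK's actual code.
type ScoreFn<TArgs> = (args: TArgs) => number;

function makeNamedScorer<TArgs>(name: string, fn: ScoreFn<TArgs>): ScoreFn<TArgs> {
  const wrapped: ScoreFn<TArgs> = (args) => fn(args);
  // `name` on a function is non-writable but configurable, so redefine it
  // instead of assigning to it.
  Object.defineProperty(wrapped, 'name', { value: name, configurable: true });
  return wrapped;
}

// Assumed slug rule: lowercase, non-alphanumeric runs collapsed to '-'.
function spanNameFor(scorerName: string): string {
  const slug = scorerName
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')
    .replace(/^-+|-+$/g, '');
  return `scorer ${slug}`;
}

const exactMatch = makeNamedScorer(
  'Exact Match',
  ({ output, expected }: { output: string; expected: string }) =>
    output === expected ? 1 : 0,
);

console.log(exactMatch.name); // 'Exact Match'
console.log(spanNameFor(exactMatch.name)); // 'scorer exact-match'
```

Redefining the property keeps the wrapper callable like any other function while letting callers read the display name without running the scorer.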

Support for Custom Extra Parameters

Scorers can now accept additional custom parameters beyond input, expected, and output:

const customScorer = Scorer(
  'Custom',
  ({ output, customProp }: { output: string; customProp: boolean }) => {
    return customProp ? 1 : 0;
  }
);

These type parameters are automatically inferred.
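A minimal sketch of how this kind of inference can work in TypeScript: a single generic parameter captures the callback's annotated argument type, extra properties included. `defineScorer`, `ArgsOf`, and the `scorerName` property are illustrative names chosen for this sketch, not SDK exports.

```typescript
// Hypothetical sketch of argument-type inference; not the SDK's implementation.
type ScorerFn<TArgs> = ((args: TArgs) => number) & { readonly scorerName: string };

function defineScorer<TArgs extends { output: unknown }>(
  name: string,
  fn: (args: TArgs) => number,
): ScorerFn<TArgs> {
  // TArgs is inferred from the annotation on `fn`'s parameter.
  return Object.assign((args: TArgs) => fn(args), { scorerName: name });
}

// Recover the argument type from a scorer, e.g. to type test data.
type ArgsOf<S> = S extends (args: infer A) => number ? A : never;

const flagScorer = defineScorer(
  'Custom',
  ({ customProp }: { output: string; customProp: boolean }) => (customProp ? 1 : 0),
);

// FlagArgs resolves to { output: string; customProp: boolean }.
type FlagArgs = ArgsOf<typeof flagScorer>;
const flagArgs: FlagArgs = { output: 'hello', customProp: true };
const flagScore = flagScorer(flagArgs);
```

With this shape, omitting `customProp` at the call site is a compile-time error, with no explicit type parameters at the definition.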

Better interop with `autoevals` scorers

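Users can now wrap `autoevals` scorers in `Scorer` and map their own argument shapes onto what the wrapped scorer expects: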

import { ExactMatch } from 'autoevals';

const WrappedExactMatch = Scorer(
  'Exact match',
  (args: {
    output: { response: string; category: string };
    expected: { response: string; category: string };
  }) => {
    return ExactMatch({
      output: args.output.response,
      expected: args.expected.response,
    });
  },
);

Unfortunately, we can't make autoevals scorers plug and play, because their input and output argument shapes vary too much from scorer to scorer.

Examples

The example-evals-nextjs scorers have been updated.

Internal improvements

Separated Score (without name) from ScoreWithName (internal use)
Improved type inference in createScorer factory
Added comprehensive type tests (scorer.types.test.ts)
CI: Moved format check before build
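To make the Score / ScoreWithName split concrete, here is a sketch under assumed shapes: the scorer returns a bare number or a nameless `Score`, and the runner attaches the name afterwards. The field names (`score`, `metadata`) and the `withName` helper are assumptions for illustration. The actual definitions live in `packages/ai/src/evals/scorers.ts`.

```typescript
// Assumed shapes; the real types in the SDK may differ.
type Score = { score: number; metadata?: Record<string, unknown> };
type ScoreWithName = Score & { name: string };

// Hypothetical helper: normalize a scorer's return value, then attach the name.
function withName(name: string, result: number | Score): ScoreWithName {
  const score: Score = typeof result === 'number' ? { score: result } : result;
  return { ...score, name };
}

const fromNumber = withName('Exact Match', 1);
const fromObject = withName('Exact Match', { score: 0, metadata: { reason: 'mismatch' } });
```

Keeping the name out of `Score` means user-written scorers never have to return it. Only internal code deals with the named variant.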

pkg-pr-new · 2025-10-24T00:59:44Z

npm i https://pkg.pr.new/axiomhq/ai/axiom@106

commit: a84bffa

Copilot

Pull Request Overview

This PR refactors the scorer system to improve type inference and ergonomics. The key changes separate scorer definitions from usage requirements, introduce automatic type inference from function arguments, and simplify the API by removing the need for explicit type parameters in most cases.

Introduces ScorerLike for flexible scorer consumption and Scorer for strict scorer definitions with a name property
Implements automatic type inference in createScorer based on the callback's argument types
Updates scorer factory to attach the name as a property instead of embedding it in the returned score
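The distinction in the list above might look roughly like the following sketch: a loose function type for consumption and a stricter one that guarantees a name. These definitions and the `resolveScorerName` helper are assumptions for illustration, not the SDK's actual types.

```typescript
// Hypothetical approximations of the two types; the real ones carry more parameters.
type ScorerLike<TArgs> = (args: TArgs) => number | Promise<number>;
type Scorer<TArgs> = ScorerLike<TArgs> & { readonly name: string };

// Read the name from the function property, falling back for anonymous functions.
function resolveScorerName(scorer: ScorerLike<never>, fallback = 'unknown'): string {
  return scorer.name.length > 0 ? scorer.name : fallback;
}

function lengthScorer({ output }: { output: string }): number {
  return output.length > 0 ? 1 : 0;
}

const namedResult = resolveScorerName(lengthScorer);
const anonymousResult = resolveScorerName(() => 0);
```

Accepting the looser type at the call site lets plain functions keep working, while anything built through a factory can promise a meaningful name.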

Reviewed Changes

Copilot reviewed 11 out of 12 changed files in this pull request and generated 4 comments.

File	Description
packages/ai/src/evals/scorers.ts	Separates `Score` and `ScoreWithName` types, introduces `ScorerLike` vs `Scorer` distinction with `TExtra` support
packages/ai/src/evals/scorer.factory.ts	Refactors `createScorer` to infer types from arguments and attach name as function property
packages/ai/test/evals/scorer.types.test.ts	Adds comprehensive type inference tests for the new scorer system
packages/ai/src/evals/eval.types.ts	Updates type references and strengthens `OutputOf` type constraints
packages/ai/src/evals/eval.ts	Updates to use `ScorerLike` and extract scorer name from function property
packages/ai/src/evals/builder.ts	Adds TODO comment about unused function
packages/ai/README.md	Documents Node version requirement for evals
examples/example-evals-nextjs/test/feature.eval.ts	Migrates to new `Scorer` factory pattern
examples/example-evals-nextjs/src/lib/scorers.ts	Updates scorers to use typed arguments and removes optional expected handling
examples/example-evals-nextjs/src/lib/capabilities/classify-ticket/evaluations/ticket-classification.eval.ts	Adds wrapped autoevals scorer example
.github/workflows/ci.yaml	Reorders CI steps to run format check before build

Files not reviewed (1)

pnpm-lock.yaml: Language not supported

lukasmalkmus

Breaking change so we should probably merge it after your other PRs, potentially even have a release in between?

c-ehrlich added 7 commits October 23, 2025 20:22

better scorers

2439db7

improve scorer types and create scorer type tests

409c88a

better inference

ec950a8

better expected types

97f8704

types

0ffe4e7

Merge branch 'main' into better-scorers

bf3970e

add autoevals scorer

d255ac8

c-ehrlich added 5 commits October 24, 2025 11:02

undefined scorer params are unknown instead of never

2d7d9e7

fix another type issue

a9aa24c

note minimum node version in readme

183ec9c

build later

57982b7

need to build a bit earlier actually

697b9d2

c-ehrlich marked this pull request as ready for review October 28, 2025 13:21

Copilot AI review requested due to automatic review settings October 28, 2025 13:21

Copilot AI reviewed Oct 28, 2025

packages/ai/src/evals/scorer.factory.ts (resolved)

packages/ai/src/evals/eval.ts (outdated, resolved)

packages/ai/src/evals/eval.ts (outdated, resolved)

examples/example-evals-nextjs/test/feature.eval.ts (resolved)

c-ehrlich added 4 commits October 28, 2025 20:33

dont need this

90893ff

dont need this export

7640d7c

better scorer name handling

421210b

remove as any here

7668499

lukasmalkmus approved these changes Oct 31, 2025

c-ehrlich added 2 commits November 3, 2025 15:19

Merge branch 'main' into better-scorers

a0e3cfd

fix test types

a84bffa

c-ehrlich merged commit cb95b46 into main Nov 3, 2025
6 checks passed

c-ehrlich deleted the better-scorers branch November 3, 2025 08:30

axiom-automation mentioned this pull request Nov 3, 2025

chore(main): release axiom 0.23.0 #113

Merged
