added support for transcoders to testing and steering endpoints by curt-tigges · Pull Request #105 · hijohnnylin/neuronpedia

curt-tigges · 2025-05-26T23:19:55Z

Problem

The inference server did not support activation testing with transcoders, because their hook points differ from those of standard SAEs.

Fix

The SAE manager and "saelens" wrapper have been updated to load transcoders from SAELens, handle their hook points correctly, and use them in activation testing. The steering endpoints will also accept transcoders, but the result is currently invalid/undefined, so steering should be turned off for transcoders until we figure out what kind of steering would be meaningful.
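As a rough illustration of the behavior described above, here is a minimal sketch of how a manager could read activations from a source's input hook while refusing to steer with transcoders. This is not the code in this PR: `SourceInfo`, `activation_hook`, and `ensure_steerable` are hypothetical names, and the hook-point strings in the usage note are illustrative assumptions rather than values taken from SAELens or the inference server.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SourceInfo:
    """Hypothetical description of a loaded source (SAE or transcoder)."""

    source_id: str
    hook_in: str  # hook point the encoder reads activations from
    hook_out: Optional[str] = None  # hook point the decoder writes to; None for a plain SAE

    @property
    def is_transcoder(self) -> bool:
        # Assumption: a transcoder is a source whose output hook differs from its input hook.
        return self.hook_out is not None and self.hook_out != self.hook_in


def activation_hook(info: SourceInfo) -> str:
    """Return the hook point to read from when testing activations."""
    return info.hook_in


def ensure_steerable(info: SourceInfo) -> None:
    """Raise if steering is requested on a source where it is not yet defined."""
    if info.is_transcoder:
        raise ValueError(
            f"Steering is not supported for transcoder source {info.source_id!r}"
        )
```

For example, `SourceInfo("0-gemmascope-transcoder-16k", "blocks.0.ln2.hook_normalized", "blocks.0.hook_mlp_out")` would be treated as a transcoder: `activation_hook` returns its input hook and `ensure_steerable` raises `ValueError`. A guard like this is one way to keep steering turned off for transcoders until it has a meaningful definition.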

Testing

I tested this on known features for which we have dashboards, confirming that POST requests to the inference server returned the expected activation values. For example:

curl -X POST http://127.0.0.1:5002/v1/activation/single \
-H "Content-Type: application/json" \
-d '{
 "prompt": "I love dogs!",
 "model": "gemma-2-2b",
 "source": "0-gemmascope-transcoder-16k",
 "index": "3"
}'

This should show a high activation on the word "love."
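
The same check can be scripted. The sketch below rebuilds the curl request above using only the Python standard library. The URL, path, and JSON fields are copied from the curl example; the helper names `build_activation_request` and `send` are assumptions made for illustration, and the response schema is not documented in this PR, so the sketch returns the parsed JSON without interpreting it.

```python
import json
import urllib.request

# Local inference server address, as used in the curl example above.
BASE_URL = "http://127.0.0.1:5002"


def build_activation_request(prompt, model, source, index, base_url=BASE_URL):
    """Build (but do not send) the POST request for /v1/activation/single."""
    body = json.dumps(
        {"prompt": prompt, "model": model, "source": source, "index": index}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/activation/single",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=30.0):
    """Send the request and return the decoded JSON response.

    Requires the inference server to be running locally.
    """
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Usage, with the server running:
# req = build_activation_request(
#     "I love dogs!", "gemma-2-2b", "0-gemmascope-transcoder-16k", "3"
# )
# print(json.dumps(send(req), indent=2))
```

Building the request separately from sending it makes the payload easy to inspect without a running server.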

vercel · 2025-05-26T23:19:59Z

Someone is attempting to deploy a commit to the Neuronpedia Team on Vercel.

A member of the Team first needs to authorize it.

hijohnnylin · 2025-06-18T05:13:44Z

Thanks! @curt-tigges this is blocked on transcoders being merged into SAELens right?

hijohnnylin · 2025-07-16T05:59:07Z

Moving to draft until we are unblocked on SAELens (which should be soon): decoderesearch/SAELens#182

added support for transcoders to testing and steering endpoints

13f63f9

Curt Tigges and others added 2 commits May 27, 2025 12:39

working activation test endpoints

55c332d

Merge branch 'main' into feature/support_transcoder_inference_steering

c245a60

curt-tigges marked this pull request as ready for review May 27, 2025 23:53

hijohnnylin marked this pull request as draft July 16, 2025 05:58

curt-tigges wants to merge 3 commits into hijohnnylin:main from curt-tigges:feature/support_transcoder_inference_steering

2 participants
