
KV cache creating steering differences #163

@shayansadeghieh

Bug Description

When steering (/steer/completion-chat), generation without the KV cache produces different results than generation with it.

How To Reproduce Bug

On main:

  1. Set use_past_kv_cache to False here
  2. From apps/inference run poetry run pytest -s -k test_completion_chat_steered_with_features_additive

The test fails. If you then set use_past_kv_cache back to True, the test passes.

Expected Behavior

I would expect the KV cache not to affect the steering results at all; it should only speed up the computation 🤔
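A toy NumPy sketch (not the actual Neuronpedia/TransformerLens code) of one mechanism that can produce this: if a steering hook applies its vector only to the last position of whatever sequence the forward pass sees, then with a KV cache each generated token is steered exactly once and its steered state stays frozen in the cache, whereas without a cache every pass re-encodes earlier generated tokens unsteered. The model, embeddings, and steering vector below are invented purely for illustration.

```python
import numpy as np

# Hypothetical toy model, NOT the real implementation: position i's hidden
# state is an embedding, and next-token scores come from the mean of all
# hidden states so far (a crude stand-in for attention over cached KVs).
# The steering "hook" adds a vector to the LAST position of whatever
# sequence the forward pass sees.
EMB = np.array([[1.0, 0.0],   # token 0
                [0.9, 0.0]])  # token 1
UNEMB = np.eye(2)             # logits are just the mean hidden state
STEER = np.array([0.0, 1.5])  # additive steering vector (pushes toward token 1)

def hidden(tokens, steer_positions):
    """Encode tokens, applying the steering vector at the given positions."""
    h = EMB[np.array(tokens)].copy()
    for p in steer_positions:
        h[p] += STEER
    return h

def next_token(h):
    return int(np.argmax(h.mean(axis=0) @ UNEMB))

def generate_with_cache(prompt, n):
    toks = list(prompt)
    # Prompt pass: the hook steers the last prompt position; states are cached.
    cache = hidden(toks, steer_positions=[len(toks) - 1])
    for _ in range(n):
        toks.append(next_token(cache))
        # Incremental pass: the hook sees a length-1 sequence, so the new
        # token is steered and its STEERED state is frozen into the cache.
        cache = np.vstack([cache, hidden([toks[-1]], steer_positions=[0])])
    return toks

def generate_no_cache(prompt, n):
    toks = list(prompt)
    for _ in range(n):
        # Full recompute: the hook steers only the CURRENT last position, so
        # previously generated tokens are re-encoded without steering.
        h = hidden(toks, steer_positions=[len(toks) - 1])
        toks.append(next_token(h))
    return toks

print(generate_with_cache([0], 3))  # [0, 1, 1, 1]
print(generate_no_cache([0], 3))    # [0, 1, 0, 0] -- same prompt, different output
```

Whether this specific indexing pattern is what's happening in /steer/completion-chat is only a guess; the point is that any hook whose effect depends on the length or positions of the sequence it receives will behave differently under use_past_kv_cache.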

Additional Context

We are attempting to upgrade to TransformerLens v3, but v3 no longer supports the HookedTransformerKeyValueCache class used by HookedTransformer. As a result, we will need to disable the KV cache in our fork of TransformerLens (within the generate_stream method), which unfortunately triggers the behaviour above.

I did notice that Bryce added the KV cache to TransformerLens v3 just yesterday, but I haven't looked into it much yet. See here.
