Add documentation and sample on KV cache eviction #1960

Merged: eaidova merged 7 commits into openvinotoolkit:master on Apr 2, 2025

vshampor
Contributor

No description provided.

## Conceptual Model
The KV cache for each sequence is divided into three logical areas:

![KV cache layout with cache eviction](./images/kv-cache-areas-diagram.svg)
Contributor

Images should be placed in the site/static/img directory and referenced from there:

Suggested change:
Before: `![KV cache layout with cache eviction](./images/kv-cache-areas-diagram.svg)`
After: `![KV cache layout with cache eviction](/img/kv-cache-areas-diagram.svg)`

Contributor Author

Done

@@ -0,0 +1,247 @@

import gc
Contributor

I think this is an example that should be moved to the samples folder.

Contributor Author

Done

AlexKoff88 requested a review from l-bat on March 26, 2025, 12:26

MaximProshin left a comment

LGTM


* Start Area: Initial tokens that are never evicted
* Evictable Area: Tokens that can be evicted based on importance scores
* Recent Area: Most recent tokens that are preserved (never evicted)
Contributor

Suggested change:
Before: `* Recent Area: Most recent tokens that are preserved (never evicted)`
After: `* Recent Area: Most recent tokens that are preserved but migrate to the Evictable Area with iterations of text generation.`

Contributor Author

Done
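
For readers following this thread, the three areas and the "migration" wording from the suggestion above can be illustrated with a small, self-contained sketch. This is not the code from the PR or the OpenVINO GenAI implementation: the function names `partition_kv_cache` and `pick_eviction_victims`, the parameters `start_size` and `recent_size`, and the per-token `scores` list are all hypothetical, and a real implementation would likely work on cache blocks and aggregated attention scores rather than single tokens.

```python
from typing import Sequence, Tuple


def partition_kv_cache(
    num_tokens: int, start_size: int, recent_size: int
) -> Tuple[range, range, range]:
    """Split token positions [0, num_tokens) into (start, evictable, recent).

    Hypothetical helper for illustration only; sizes are counted in tokens.
    """
    if num_tokens < 0 or start_size < 0 or recent_size < 0:
        raise ValueError("sizes must be non-negative")
    # Start Area: the first `start_size` tokens, never evicted.
    start_end = min(start_size, num_tokens)
    # Recent Area: the last `recent_size` tokens, never overlapping the start.
    recent_begin = max(start_end, num_tokens - recent_size)
    # Evictable Area: everything in between.
    return (
        range(0, start_end),
        range(start_end, recent_begin),
        range(recent_begin, num_tokens),
    )


def pick_eviction_victims(
    scores: Sequence[float], evictable: range, num_to_evict: int
) -> list:
    """Return the positions of the lowest-scoring tokens in the evictable area."""
    ranked = sorted(evictable, key=lambda pos: scores[pos])
    return sorted(ranked[:num_to_evict])


if __name__ == "__main__":
    # With 10 tokens, positions 7..9 are "recent".
    print(partition_kv_cache(10, start_size=2, recent_size=3))
    # After one more generated token, position 7 has moved into the evictable area.
    print(partition_kv_cache(11, start_size=2, recent_size=3))
```

The second call shows why "never evicted" was imprecise for the Recent Area: the area is a sliding window at the end of the sequence, so a token leaves it (and becomes evictable) as generation proceeds.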

@AlexKoff88
Contributor

Well-written! I would add a paragraph at the end like "Areas/Subjects for improvements".

@vshampor
Contributor Author

> Well-written! I would add a paragraph at the end like "Areas/Subjects for improvements".

Done

eaidova merged commit 59b4c90 into openvinotoolkit:master on Apr 2, 2025
34 of 54 checks passed
6 participants