Streaming but it's not actually clearing kvbuffer or using sliding windows? #41

@tommedema

It seems the current implementation grows the kvbuffer without bound, which goes against streaming principles: you'd want a continuous pipe that doesn't blow up VRAM. In other words, when streaming, shouldn't we keep VRAM low by using a sliding window of context?
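
To make the question concrete, here is a minimal sketch of the kind of bounded buffer I have in mind. It is not this project's code: `SlidingWindowKVCache`, `window`, `append`, and `snapshot` are hypothetical names, and the string placeholders stand in for real key/value tensors. It only illustrates the eviction policy; whether the model's output quality holds up with truncated context is a separate question.

```python
from collections import deque


class SlidingWindowKVCache:
    """Hypothetical bounded KV cache that keeps only the last `window` steps."""

    def __init__(self, window: int):
        if window <= 0:
            raise ValueError("window must be positive")
        self.window = window
        # deque(maxlen=...) silently drops the oldest entry once it is full,
        # so memory use stays constant no matter how long the stream runs.
        self._keys = deque(maxlen=window)
        self._values = deque(maxlen=window)

    def append(self, key, value):
        """Store one decoding step's key/value pair, evicting the oldest if full."""
        self._keys.append(key)
        self._values.append(value)

    def __len__(self):
        return len(self._keys)

    def snapshot(self):
        """Return the retained keys and values, oldest first."""
        return list(self._keys), list(self._values)


# Simulate a 10-step stream with a window of 4 steps.
cache = SlidingWindowKVCache(window=4)
for step in range(10):
    cache.append(f"k{step}", f"v{step}")

keys, values = cache.snapshot()
```

After the loop, the cache holds only the four most recent steps instead of all ten, which is the behavior I would expect from a streaming setup.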
