Replies: 2 comments
-
If I had to guess at what I need to do...
Please correct me if I'm wrong. Update: I have since learned about KV caches and attention masks. I may need to store the whole state here. Logits can be kept by submitting a decode batch that has been flagged to save logits. They are then part of the state image, and I can truncate the sequence back to the previous logit boundary. This is what I'm going to try eventually.
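To make the "truncate back to the previous logit boundary" idea concrete, here is a minimal sketch in plain Python. It is not tied to any inference library: `RewindableSequence`, `append_turn` and `rewind_to` are hypothetical names, and the token list only stands in for the real context. The assumption is that a boundary is recorded after every turn (the point where logits were saved), so rewinding becomes a truncation instead of a full state reload.

```python
from dataclasses import dataclass, field


@dataclass
class RewindableSequence:
    """Token sequence with a boundary recorded after every turn (hypothetical helper)."""
    tokens: list[int] = field(default_factory=list)
    boundaries: list[int] = field(default_factory=list)  # token count after each turn

    def append_turn(self, turn_tokens: list[int]) -> int:
        """Append one turn and record the boundary; returns the turn index."""
        self.tokens.extend(turn_tokens)
        self.boundaries.append(len(self.tokens))
        return len(self.boundaries) - 1

    def rewind_to(self, turn_index: int) -> int:
        """Drop everything after the given turn; returns the number of tokens kept."""
        keep = self.boundaries[turn_index]
        del self.tokens[keep:]
        del self.boundaries[turn_index + 1:]
        # Assumption: a real backend would also have to discard its cached
        # entries for positions >= keep at this point.
        return keep
```

With this layout only one growing sequence is stored per session, plus one integer per turn. Whether the backend can actually drop cached entries past `keep` without a full reload is the part that still has to be confirmed.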
-
I was thinking about this and realised that this kind of chat log forms a tree structure. Each call to generate, or each message the user appends, adds a new leaf to the tree. It would really help me out if I knew more about the structure of the context.
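The tree described here could look roughly like the sketch below. Everything in it is an assumption made for illustration: `ChatNode`, `add`, `path` and `shared_prefix_len` are made-up names, and nodes hold plain text where a real implementation would probably hold token ranges.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)  # compare nodes by identity, not by field values
class ChatNode:
    """One message in the chat tree (hypothetical structure)."""
    role: str
    text: str
    parent: Optional["ChatNode"] = field(default=None, repr=False)
    children: list["ChatNode"] = field(default_factory=list)

    def add(self, role: str, text: str) -> "ChatNode":
        """Attach a new leaf under this node and return it."""
        child = ChatNode(role, text, parent=self)
        self.children.append(child)
        return child

    def path(self) -> list["ChatNode"]:
        """Messages from the root down to this node, i.e. the context to feed."""
        node, out = self, []
        while node is not None:
            out.append(node)
            node = node.parent
        return out[::-1]


def shared_prefix_len(a: ChatNode, b: ChatNode) -> int:
    """Number of leading messages two branches have in common."""
    n = 0
    for x, y in zip(a.path(), b.path()):
        if x is not y:
            break
        n += 1
    return n
```

Rewinding then means picking an earlier node and calling `add` on it, which starts a sibling branch. `shared_prefix_len` indicates how much of the already evaluated context two branches share, and so how much could in principle be reused rather than recomputed.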
-
I'm planning to make a chat session where you can rewind the discussion and continue from that point. I was planning to use save states for this, but saving the whole state every time is very expensive.
I'm not entirely clear on how the state works. Would it be possible to split the state into pieces so that it is rewindable and I only have to store one state per session?