Hello team,
Today, batch generation works like the HF generate() function: it accepts several input texts, but the generation parameters (temperature, top-k, etc.) apply to the whole batch, so it is not possible to use different parameters for different inputs within the same batch.
Is it because using different parameters in the same batch would degrade performance so much that it would defeat the purpose of batch generation?
Ideally, it would be great if one could do something like this:
For example, this is something that can be achieved with NVIDIA FasterTransformer: https://github.com/NVIDIA/FasterTransformer/blob/main/examples/pytorch/gpt/gpt_example.py
Thank you!