Give the possibility to obtain the full response when calling the vLLM generate function #1199

Open
@alonsosilvaallende

Description

I'm using InspectAI to evaluate language models; in particular, I'm evaluating the benefits of structured text generation with Outlines. I would like to obtain the full response when calling the vLLM generate function, since InspectAI expects to receive the full response. Would it be possible to give the user the option of getting the full response? The default should remain the current behaviour, which is a filtered response.
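A minimal sketch of how such an option could look, using a hypothetical `return_full_output` flag (not part of the current Outlines API) and stand-in dataclasses mimicking vLLM's `RequestOutput`/`CompletionOutput` structure; the default stays the filtered text:

```python
from dataclasses import dataclass
from typing import List, Union

# Stand-ins for vLLM's CompletionOutput/RequestOutput, for illustration only.
@dataclass
class CompletionOutput:
    text: str
    token_ids: List[int]

@dataclass
class RequestOutput:
    prompt: str
    outputs: List[CompletionOutput]

def generate(prompt: str, return_full_output: bool = False) -> Union[str, RequestOutput]:
    """Hypothetical wrapper: by default return only the generated text
    (the current filtered behaviour); with return_full_output=True, return
    the whole response object so callers like InspectAI can inspect it."""
    # In the real integration this object would come from vLLM's generate call.
    full = RequestOutput(prompt=prompt, outputs=[CompletionOutput("42", [19, 17])])
    if return_full_output:
        return full
    return full.outputs[0].text

print(generate("What is 6*7?"))                                       # → 42
print(generate("What is 6*7?", return_full_output=True).prompt)       # → What is 6*7?
```

Keeping the flag keyword-only and defaulting it to `False` would preserve backward compatibility for existing callers.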
