Skip to content

Make it possible to pass a logits processor to Generator #1487

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
rlouf opened this issue Mar 8, 2025 · 2 comments · Fixed by #1494
Closed

Make it possible to pass a logits processor to Generator #1487

rlouf opened this issue Mar 8, 2025 · 2 comments · Fixed by #1494
Labels
Milestone

Comments

@rlouf
Copy link
Member

rlouf commented Mar 8, 2025

See for instance #1408, but also for internal testing. This is an important design decisions, we need to consider pros & cons of each possible choice.

@rlouf rlouf added enhancement structured generation Linked to structured generation impact/user labels Mar 8, 2025
@rlouf rlouf added this to the 1.0 milestone Mar 8, 2025
@RobinPicard
Copy link
Contributor

RobinPicard commented Mar 12, 2025

I've also run into situations in which it would be convenient to be able to do that. I see no problems with it apart from risking to make the interface harder to understand for users.

For the implementation, I think it would make sense to keep only the BlackBoxGenerator and SteerableGenerator classes as objects later called by the user as the separation between them is made based on the type of model instead of the type of output type (so handle within them the different types of output types). I see 2 areas in which we must consider different options:

User entry point

1. The user passes their logits processor directly to Generator

Pros:

  • Does not create additional entities
  • Simple to use

Cons:

  • Can be confusing to users as it requires widening the range of possible values provided as arguments to the main entrypoint function Generator

2. Create a separate function LogitsProcessorGenerator that would accept only logits processors as argument on top of a model. It would then return an instance of SteerableGenerator just like Generator

Pros:

  • Does not add any complexity for basic users
  • Provide a straightforward object to use for advanced users

Cons:

  • Add another object to the interface

3. Make users directly interact with the SteerableGenerator if they want to pass a logits processor to the generator

Pros:

  • Does not add any complexity for basic users
  • Does not create additional entities

Cons:

  • Harder to use and understand. Do we want to have users directly interact with an additional object?

Implementation in SteerableGenerator

1. Use the __post_init__ method by adding a conditional to treat the logits processor value type of the output_type argument

Pros:

  • Does not add more methods to the class
  • Straightforward to implement (only an if and 2 lines)

Cons:

  • Makes the __post_init__ heavier and potentially harder to understand

2. Create a class method with_logits_processor to handle creating a SteerableGenerator instance with a logits processor as an output type.

Pros:

  • Does not require adding complexity to the __post_init__ method
  • Clearly separate the handling of regular output types vs. logits processor

Cons:

  • Adds another method to the class and can make it harder to understand/use

PS: something we may want to take into consideration is the handling of an interegular FSM as an another type of output type. It's currently an accepted output type using option 1 and 1.

@rlouf
Copy link
Member Author

rlouf commented Mar 12, 2025

We could have users pass the logits processor as a keyword-only processor argument to the Generator object. I'm not a fan of introducing a new object or forcing users to know about the internals.

As for the implementation, I like the classmethod solution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants