Forward ref_audio and ref_text to model.generate() when the model's generate() signature accepts them (checked via inspect.signature, consistent with existing voice/instruct/speed pattern). Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
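The signature check described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `build_generate_kwargs`, `CloningModel`, and `PlainModel` are hypothetical names, and the sketch assumes an optional argument is forwarded only when `generate()` declares it as a named parameter.

```python
import inspect


def build_generate_kwargs(model, text, **optional):
    """Collect kwargs for model.generate(), dropping unsupported ones.

    Hypothetical helper: an optional argument is kept only if it is not
    None and generate() names it explicitly in its signature.
    """
    params = inspect.signature(model.generate).parameters
    kwargs = {"text": text}
    for name, value in optional.items():
        if value is not None and name in params:
            kwargs[name] = value
    return kwargs


class CloningModel:
    """Stand-in for a model whose generate() supports voice cloning."""

    def generate(self, text, voice=None, ref_audio=None, ref_text=None):
        return {"text": text, "ref_audio": ref_audio, "ref_text": ref_text}


class PlainModel:
    """Stand-in for a model without cloning parameters."""

    def generate(self, text, voice=None):
        return {"text": text}


clone_kwargs = build_generate_kwargs(
    CloningModel(), "hi", ref_audio="a.wav", ref_text="hello", speed=1.2
)
plain_kwargs = build_generate_kwargs(
    PlainModel(), "hi", ref_audio="a.wav", ref_text="hello"
)
```

With this approach a model that lacks `ref_audio` and `ref_text` simply never receives them, so older models keep working without a `TypeError` from unexpected keyword arguments.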
Voice cloning produces garbled output when ref_text doesn't match the reference audio. Make ref_text mandatory to prevent silent quality failures — discovered during manual testing. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
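One way to enforce this requirement is a small validation step that runs before generation. The sketch below is an assumption about the shape of such a check, not the PR's actual schema code; `validate_clone_request` is a hypothetical name, and it treats a whitespace-only transcript as missing.

```python
def validate_clone_request(ref_audio, ref_text):
    """Reject cloning requests that lack a usable transcript.

    Hypothetical check: ref_text is only required when ref_audio is set,
    so ordinary (non-cloning) requests pass through untouched.
    """
    if ref_audio is None:
        return  # not a cloning request, nothing to validate
    if ref_text is None or not ref_text.strip():
        raise ValueError("ref_text is required when ref_audio is provided")


def is_valid(ref_audio, ref_text):
    """Convenience wrapper that reports validity as a boolean."""
    try:
        validate_clone_request(ref_audio, ref_text)
    except ValueError:
        return False
    return True
```

Failing fast here turns a silent quality problem into an explicit client error. Note that a check like this cannot verify that the transcript actually matches the audio, only that one was supplied.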
Thanks for building on the voice clone work: the generation params, One concern with the VoiceDesign routing: The The problem with Suggestion: remove the
Thanks for the PR. The generation params forwarding, FakeModel test fixture, and model.sample_rate usage are all good improvements that I want to keep. @ethannortharc's feedback on the VoiceDesign routing is correct. The existing tts.py already uses The specific problem: CustomVoice models inherit Suggestion: drop the Also, now that #676 is merged, this branch will need a rebase since both PRs add
Extends PR #676's voice cloning support with two additional Qwen3-TTS capabilities:
- When the model exposes generate_voice_design() and instructions are provided, the request routes through the dedicated VD path instead of standard generate().
- temperature, top_k, top_p, repetition_penalty, and max_tokens are accepted in the API schema and forwarded to whichever generation path is used.

Also fixes test fixtures to use a plain FakeModel instead of MagicMock, preventing false hasattr() positives for VoiceDesign detection.
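The routing described above, and the reason MagicMock fixtures interfere with it, can be illustrated with a short sketch. Everything here is hypothetical: `route`, `FakeVoiceDesignModel`, and the `generate_voice_design(text, instruct, ...)` signature are assumptions made for illustration, and the `FakeModel` shown is not the fixture from the PR.

```python
from unittest.mock import MagicMock


class FakeModel:
    """Plain fixture: only has the attributes it explicitly defines."""

    sample_rate = 24000

    def generate(self, text, **kwargs):
        return ("generate", text, kwargs)


class FakeVoiceDesignModel(FakeModel):
    """Fixture that additionally exposes the VoiceDesign entry point."""

    def generate_voice_design(self, text, instruct, **kwargs):
        return ("voice_design", text, instruct, kwargs)


def route(model, text, instruct=None, **gen_params):
    """Pick a generation path and forward only the params that were set."""
    params = {k: v for k, v in gen_params.items() if v is not None}
    if instruct and hasattr(model, "generate_voice_design"):
        return model.generate_voice_design(text, instruct=instruct, **params)
    if instruct is not None:
        params["instruct"] = instruct
    return model.generate(text, **params)


vd_result = route(FakeVoiceDesignModel(), "hi", instruct="warm", temperature=0.7, top_k=None)
std_result = route(FakeModel(), "hi", instruct="warm", top_p=0.9)

# MagicMock creates attributes on access, so hasattr() is always True for it.
mock_has_vd = hasattr(MagicMock(), "generate_voice_design")
fake_has_vd = hasattr(FakeModel(), "generate_voice_design")
```

Because a MagicMock answers `hasattr()` positively for any attribute name, a mocked model would always take the VoiceDesign branch in a test, regardless of what the real model supports. A plain class only reports the methods it defines.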
You can test this with the following CLI commands:
And cloning works with the designed voice, thanks to PR #676