
[Bug] TypeError: expected str, bytes or os.PathLike object, not NoneType when launching model #3271

Closed
faallaaf opened this issue Nov 20, 2023 · 6 comments · Fixed by idiap/coqui-ai-TTS#252
Labels
bug Something isn't working wontfix This will not be worked on but feel free to help.

Comments

@faallaaf

Describe the bug

When launching the Bark model (I tried both downloading it via `python3 TTS/server/server.py --model_name tts_models/multilingual/multi-dataset/bark --use_cuda true` and adding it manually), the launch fails with the following traceback:

Traceback (most recent call last):
  File "/root/TTS/server/server.py", line 104, in <module>
    synthesizer = Synthesizer(
  File "/root/TTS/utils/synthesizer.py", line 93, in __init__
    self._load_tts(tts_checkpoint, tts_config_path, use_cuda)
  File "/root/TTS/utils/synthesizer.py", line 183, in _load_tts
    self.tts_config = load_config(tts_config_path)
  File "/root/TTS/config/__init__.py", line 85, in load_config
    ext = os.path.splitext(config_path)[1]
  File "/usr/lib/python3.10/posixpath.py", line 118, in splitext
    p = os.fspath(p)
TypeError: expected str, bytes or os.PathLike object, not NoneType

Windows 11, Docker under WSL
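The crash itself is easy to reproduce in isolation: `os.path.splitext` rejects `None`, so any code path where the config file is never resolved surfaces as this TypeError rather than a clear "model not supported" message. A minimal standalone sketch that mirrors the `load_config` call site (not the actual TTS code):

```python
import os

def load_config(config_path):
    # Mirrors TTS/config/__init__.py: splitext() runs before any None check,
    # so an unresolved config path raises TypeError instead of a clear error.
    ext = os.path.splitext(config_path)[1]
    return ext

try:
    load_config(None)  # what the server effectively does for bark/xtts
except TypeError as err:
    print(err)  # expected str, bytes or os.PathLike object, not NoneType
```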

To Reproduce

Launch the server in Docker and try to use the Bark model.

Expected behavior

No response

Logs

No response

Environment

{
    "CUDA": {
        "GPU": [
            "NVIDIA GeForce RTX 3070"
        ],
        "available": true,
        "version": "11.8"
    },
    "Packages": {
        "PyTorch_debug": false,
        "PyTorch_version": "2.1.1+cu118",
        "TTS": "0.20.6",
        "numpy": "1.22.0"
    },
    "System": {
        "OS": "Linux",
        "architecture": [
            "64bit",
            "ELF"
        ],
        "processor": "x86_64",
        "python": "3.10.12",
        "version": "#1 SMP Thu Oct 5 21:02:42 UTC 2023"
    }
}

Additional context

No response

@faallaaf faallaaf added the bug Something isn't working label Nov 20, 2023
@erogol
Member

erogol commented Nov 20, 2023

Hey there! Just a heads-up, our server doesn't currently support Tortoise, Bark, or XTTS - it's a bit behind the times. If anyone's up for lending a hand, I'm totally here to guide you through it! 😊👍

@Gnosnay

Gnosnay commented Nov 20, 2023

> Hey there! Just a heads-up, our server doesn't currently support Tortoise, Bark, or XTTS - it's a bit behind the times. If anyone's up for lending a hand, I'm totally here to guide you through it! 😊👍

Hi, I had a try at this.

It seems the problem is here:

TTS/TTS/utils/manage.py

Lines 407 to 410 in 29dede2

if (
    model not in ["tortoise-v2", "bark"] and "fairseq" not in model_name and "xtts" not in model_name
):  # TODO: This is stupid but don't care for now.
    output_model_path, output_config_path = self._find_files(output_path)

But when I removed this if condition, I ran into more issues with checkpoint loading, around

https://github.com/coqui-ai/TTS/blob/29dede20d336c8250810575fcdcdbbcad8c40a44/TTS/utils/synthesizer.py#L192C35-L192C35

It seems that checkpoint_path should also be given:

TTS/TTS/tts/models/xtts.py

Lines 725 to 726 in 29dede2

checkpoint_dir=None,
checkpoint_path=None,

After I modified all of this, it runs on my local machine.

However, I found that the demo page cannot specify the reference wav or the language, and I then got the following exception:
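The gist of those modifications can be sketched in isolation: resolve the model files explicitly and fail loudly when the config is missing, rather than letting None flow into load_config. This is a hypothetical standalone guard, not the TTS API (resolve_model_files and the file names are illustrative):

```python
import os

def resolve_model_files(output_path):
    """Hypothetical guard: return (checkpoint_dir, config_path) for a
    downloaded model directory, raising a clear error instead of
    silently returning None for the config path."""
    config_path = os.path.join(output_path, "config.json")
    if not os.path.isfile(config_path):
        raise FileNotFoundError(
            f"config.json not found in {output_path}; "
            "the model may need to be loaded via checkpoint_dir instead"
        )
    # Models like bark/xtts load from the directory, not a single .pth file.
    return output_path, config_path
```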

 > Model input: zxccx
 > Speaker Idx:
 > Language Idx: en
 > Text splitted to sentences.
['zxccx']
ERROR:server:Exception on /api/tts [GET]
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 1455, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 869, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 867, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 852, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
  File "/root/TTS/server/server.py", line 204, in tts
    wavs = synthesizer.tts(text, speaker_name=speaker_idx, language_name=language_idx, style_wav=style_wav)
  File "/root/TTS/utils/synthesizer.py", line 376, in tts
    outputs = self.tts_model.synthesize(
  File "/root/TTS/tts/models/xtts.py", line 392, in synthesize
    return self.inference_with_config(text, config, ref_audio_path=speaker_wav, language=language, **kwargs)
  File "/root/TTS/tts/models/xtts.py", line 414, in inference_with_config
    return self.full_inference(text, ref_audio_path, language, **settings)
  File "/usr/local/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/root/TTS/tts/models/xtts.py", line 475, in full_inference
    (gpt_cond_latent, speaker_embedding) = self.get_conditioning_latents(
  File "/usr/local/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/root/TTS/tts/models/xtts.py", line 351, in get_conditioning_latents
    audio = load_audio(file_path, load_sr)
  File "/root/TTS/tts/models/xtts.py", line 72, in load_audio
    audio, lsr = torchaudio.load(audiopath)
  File "/usr/local/lib/python3.10/site-packages/torchaudio/_backend/utils.py", line 204, in load
    return backend.load(uri, frame_offset, num_frames, normalize, channels_first, format, buffer_size)
  File "/usr/local/lib/python3.10/site-packages/torchaudio/_backend/soundfile.py", line 27, in load
    return soundfile_backend.load(uri, frame_offset, num_frames, normalize, channels_first, format)
  File "/usr/local/lib/python3.10/site-packages/torchaudio/_backend/soundfile_backend.py", line 221, in load
    with soundfile.SoundFile(filepath, "r") as file_:
  File "/usr/local/lib/python3.10/site-packages/soundfile.py", line 658, in __init__
    self._file = self._open(file, mode_int, closefd)
  File "/usr/local/lib/python3.10/site-packages/soundfile.py", line 1212, in _open
    raise TypeError("Invalid file: {0!r}".format(self.name))
TypeError: Invalid file: None
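This second traceback is the same pattern one layer down: the demo page never sends a reference wav, so ref_audio_path reaches torchaudio.load as None and soundfile rejects it. A standalone sketch of the kind of early check that would give an actionable error instead (the function is illustrative, not the real XTTS method):

```python
def check_reference_audio(ref_audio_path):
    """Fail early with an actionable message instead of soundfile's
    'TypeError: Invalid file: None' deep in the stack."""
    if ref_audio_path is None:
        raise ValueError(
            "XTTS voice cloning needs a reference sample: "
            "pass speaker_wav=/path/to/voice.wav"
        )
    return ref_audio_path
```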

@a-witkowski

@erogol Hello, I have the same error, and I'm not using the server:
tts --text "Text for TTS" --model_name tts_models/multilingual/multi-dataset/xtts_v2 --out_path output.wav --language_idx en

 tts_models/multilingual/multi-dataset/xtts_v2 is already downloaded.
 Using model: xtts
 Text: Text for TTS
 Text splitted to sentences.
['Text for TTS']
Traceback (most recent call last):
  File "C:\Users\test\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main  
    return _run_code(code, main_globals, None,
  File "C:\Users\test\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Users\test\AppData\Local\Programs\Python\Python310\Scripts\tts.exe\__main__.py", line 7, in <module>
  File "D:\tts\TTS\bin\synthesize.py", line 538, in main
    wav = synthesizer.tts(
  File "D:\tts\TTS\utils\synthesizer.py", line 378, in tts
    outputs = self.tts_model.synthesize(
  File "D:\tts\TTS\tts\models\xtts.py", line 392, in synthesize
    return self.inference_with_config(text, config, ref_audio_path=speaker_wav, language=language, **kwargs)
  File "D:\tts\TTS\tts\models\xtts.py", line 414, in inference_with_config
    return self.full_inference(text, ref_audio_path, language, **settings)
  File "C:\Users\test\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "D:\tts\TTS\tts\models\xtts.py", line 475, in full_inference
    (gpt_cond_latent, speaker_embedding) = self.get_conditioning_latents(
  File "C:\Users\test\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "D:\tts\TTS\tts\models\xtts.py", line 351, in get_conditioning_latents
    audio = load_audio(file_path, load_sr)
  File "D:\tts\TTS\tts\models\xtts.py", line 72, in load_audio
    audio, lsr = torchaudio.load(audiopath)
  File "C:\Users\test\AppData\Local\Programs\Python\Python310\lib\site-packages\torchaudio\_backend\utils.py", line 204, in load
    return backend.load(uri, frame_offset, num_frames, normalize, channels_first, format, buffer_size)
  File "C:\Users\test\AppData\Local\Programs\Python\Python310\lib\site-packages\torchaudio\_backend\soundfile.py", line 27, in load
    return soundfile_backend.load(uri, frame_offset, num_frames, normalize, channels_first, format)
  File "C:\Users\test\AppData\Local\Programs\Python\Python310\lib\site-packages\torchaudio\_backend\soundfile_backend.py", line 221, in load
    with soundfile.SoundFile(filepath, "r") as file_:
  File "C:\Users\test\AppData\Local\Programs\Python\Python310\lib\site-packages\soundfile.py", line 658, in __init__
    self._file = self._open(file, mode_int, closefd)
  File "C:\Users\test\AppData\Local\Programs\Python\Python310\lib\site-packages\soundfile.py", line 1212, in _open
    raise TypeError("Invalid file: {0!r}".format(self.name))
TypeError: Invalid file: None

@rkfg

rkfg commented Dec 11, 2023

That's not the same error. You need to specify a speaker sample using --speaker_wav /path/to/file.wav; this model can't be used without it, IIRC.
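For reference, the corrected CLI invocation from the earlier comment would then look like this (the reference wav path is illustrative):

```shell
tts --text "Text for TTS" \
    --model_name tts_models/multilingual/multi-dataset/xtts_v2 \
    --speaker_wav /path/to/reference.wav \
    --language_idx en \
    --out_path output.wav
```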


stale bot commented Jan 10, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. You might also look at our discussion channels.

@stale stale bot added the wontfix This will not be worked on but feel free to help. label Jan 10, 2024
@stale stale bot closed this as completed Jan 29, 2024
@eginhard
Contributor

Our fork now supports all Coqui TTS models in the server. Specifying a speaker_wav file is not possible yet; this is tracked in idiap#254.
