Preprocessing #1

@WhiteChy

Thank you for open-sourcing the Conan project!
I’ve been trying to reproduce the results by following the paper and the released code, but I still can’t reach demo-level quality. Specifically:

Emformer accuracy: 63%

Generated waveform: The output from the main Conan model sounds noticeably degraded. The training loss curve is attached below:

[Image: training loss curve]

It seems that the only missing component from the repository is the data preprocessing stage (particularly the HuBERT token extraction). I suspect that my HuBERT tokens might be the issue.
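One sanity check that might help narrow this down is whether the number of HuBERT tokens per utterance matches the frame count the encoder should produce. The sketch below is only an assumption about the setup: it uses the standard HuBERT / wav2vec 2.0 convolutional front end (kernels 10,3,3,3,3,2,2 and strides 5,2,2,2,2,2,2 on 16 kHz audio, roughly 50 frames per second), which may not match what Conan actually uses. The function names `expected_num_frames` and `tokens_look_aligned` are hypothetical helpers, not part of the repository.

```python
# Sketch: expected HuBERT frame count for a waveform of a given length.
# Assumption: standard HuBERT / wav2vec 2.0 conv feature encoder on 16 kHz
# audio, with no padding. Conan's real preprocessing settings may differ.
CONV_KERNELS = (10, 3, 3, 3, 3, 2, 2)
CONV_STRIDES = (5, 2, 2, 2, 2, 2, 2)


def expected_num_frames(num_samples: int) -> int:
    """Return how many frames the conv stack emits for `num_samples` samples."""
    n = num_samples
    for kernel, stride in zip(CONV_KERNELS, CONV_STRIDES):
        if n < kernel:
            # Input is too short to produce even one frame at this layer.
            return 0
        n = (n - kernel) // stride + 1
    return n


def tokens_look_aligned(num_samples: int, num_tokens: int, tolerance: int = 1) -> bool:
    """Check that a token sequence length is within `tolerance` frames of expected."""
    return abs(expected_num_frames(num_samples) - num_tokens) <= tolerance


if __name__ == "__main__":
    # One second of 16 kHz audio.
    print(expected_num_frames(16000))
    # A token sequence at half the expected rate would be flagged.
    print(tokens_look_aligned(16000, 25))
```

If the token counts are off by a large factor (for example, half or double the expected length), that would point to a sample-rate or frame-rate mismatch in my extraction rather than a problem in the model itself.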

Would you consider releasing the preprocessing scripts or providing a minimal example for HuBERT token generation?
Any guidance or clarification would be greatly appreciated!
