
feat(speech): Integrated a "SpeechEmotionModel" using HuggingFace {superb/wav2vec2-base-superb-er} with config-driven setup and standardized output schema. #33

Open
VedaSiddhartha010 wants to merge 3 commits into ruxailab:main from VedaSiddhartha010:feat/speech-emotion-model



VedaSiddhartha010 commented Mar 28, 2026

Integrated a "SpeechEmotionModel" that performs emotion recognition for voice/speech analysis.

What’s Implemented:

  • Introduced a new SpeechEmotionModel, implemented as a torch.nn.Module.
  • Integrated the HuggingFace audio-classification pipeline with the checkpoint:
    • superb/wav2vec2-base-superb-er
  • Added configuration support in config.yaml for model and device selection.
  • Designed the model to be fully config-driven, so the checkpoint and device can be changed without editing code.
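
As a rough illustration of the config-driven setup described above, the sketch below resolves the model and device from an already-parsed config dictionary and only then builds the HuggingFace pipeline. It is not the code from this PR: the `speech_emotion` section, its `model_name` and `device` keys, and the helper names `SpeechEmotionConfig`, `load_speech_config`, and `build_classifier` are all assumptions made for the example, and the torch.nn.Module wrapper is left out for brevity.

```python
from dataclasses import dataclass

# Defaults used when config.yaml omits a key (assumed, not taken from the PR).
DEFAULT_MODEL = "superb/wav2vec2-base-superb-er"
DEFAULT_DEVICE = "cpu"


@dataclass(frozen=True)
class SpeechEmotionConfig:
    """Hypothetical container for the speech-emotion settings."""
    model_name: str = DEFAULT_MODEL
    device: str = DEFAULT_DEVICE


def load_speech_config(cfg: dict) -> SpeechEmotionConfig:
    """Read the (assumed) `speech_emotion` section of a parsed config.yaml."""
    section = cfg.get("speech_emotion") or {}
    return SpeechEmotionConfig(
        model_name=section.get("model_name", DEFAULT_MODEL),
        device=section.get("device", DEFAULT_DEVICE),
    )


def build_classifier(config: SpeechEmotionConfig):
    """Create the audio-classification pipeline for the configured model."""
    # Imported lazily so the config helpers work without transformers installed.
    from transformers import pipeline

    # The pipeline accepts an integer device index: -1 is CPU, 0 is the first GPU.
    device_index = 0 if config.device.startswith("cuda") else -1
    return pipeline(
        "audio-classification",
        model=config.model_name,
        device=device_index,
    )
```

With this shape, switching the checkpoint or moving to a GPU would only require changing the corresponding values in config.yaml, for example setting `device` to `cuda` under the assumed `speech_emotion` section.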

Output Format:

The model returns structured output consistent with the existing pipeline:

{
  "emotion": {
    "label": "...",
    "score": "..."
  }
}
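
One possible way to produce this schema from the pipeline result is sketched below. It assumes the classifier returns a list of `{"label", "score"}` dictionaries, as the HuggingFace audio-classification pipeline does, and that `score` is emitted as a float. The helper name `to_emotion_schema` and the sample predictions are hypothetical and are not taken from this PR.

```python
def to_emotion_schema(predictions: list) -> dict:
    """Keep the highest-scoring prediction and wrap it in the output schema."""
    if not predictions:
        # Raising on an empty result is a choice made for this sketch.
        raise ValueError("no predictions to format")
    best = max(predictions, key=lambda p: p["score"])
    return {
        "emotion": {
            "label": best["label"],
            "score": float(best["score"]),
        }
    }


# Made-up predictions in the shape the pipeline returns.
sample = [
    {"label": "neu", "score": 0.12},
    {"label": "hap", "score": 0.71},
    {"label": "sad", "score": 0.17},
]
result = to_emotion_schema(sample)
```

Here `result` holds a single `emotion` entry with the top label and its score, which keeps the downstream consumers independent of how many classes the checkpoint exposes.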
