
Prototype: Standardized Sentiment and Emotion Output Schema#22

Open
aadi-joshi wants to merge 6 commits into ruxailab:main from aadi-joshi:feature/standardized-output-schema

Conversation

@aadi-joshi

This PR proposes a simple standardized output format for sentiment / emotion results.

Right now, different parts of the ruxailab stack return slightly different structures. For example, the text sentiment endpoint returns POS/NEG/NEU plus a confidence score, while the facial sentiment API returns emotion percentages. That makes it harder to store or compare results across modalities.

This adds a small prototype for a shared schema that could represent outputs from text sentiment, audio transcript sentiment, and facial emotion analysis.

Included in this PR:

  • docs/emotion_output_schema_v1.md
    proposal for a common JSON structure + field descriptions

  • normalization/schema.py
    pydantic models for validating the schema

  • normalization/normalizer.py
    helper functions to convert existing outputs into the schema (text sentiment, audio pipeline output, facial emotion percentages)

  • examples/audio_sentiment_standardized_example.json
    example output showing segment-level audio sentiment

  • tests/unit/test_normalization.py
    tests for the schema + normalization helpers
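
As a rough illustration of what `normalization/schema.py` might contain, here is a minimal sketch using the model names from this PR (`StandardizedOutput`, `ResultEntry`, `Segment`); the individual field names and types are assumptions for illustration, not the actual fields defined in `docs/emotion_output_schema_v1.md`:

```python
from typing import List, Optional

from pydantic import BaseModel


class Segment(BaseModel):
    """Optional time-aligned slice of the input (hypothetical fields)."""
    start: Optional[float] = None  # start time in seconds
    end: Optional[float] = None    # end time in seconds
    text: Optional[str] = None     # transcript text for audio segments


class ResultEntry(BaseModel):
    """One normalized label/score pair (hypothetical fields)."""
    label: str    # lowercase vocabulary, e.g. "positive"
    score: float  # always on a 0.0-1.0 scale
    segment: Optional[Segment] = None


class StandardizedOutput(BaseModel):
    """Shared v1 envelope across text, audio, and facial results."""
    schema_version: str = "1.0"
    modality: str  # e.g. "text", "audio", "facial"
    task: str
    results: List[ResultEntry]
```

With models like these, any downstream consumer can validate a result with a single `StandardizedOutput(**payload)` call regardless of which modality produced it.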

All changes are additive and existing endpoints are untouched.

Mainly opening this as a starting point for discussing how sentiment/emotion outputs could be standardized across services. If there's already a preferred structure for outputs like this in the project, happy to align this with it.

Define a unified JSON schema for sentiment and emotion analysis results
across text, audio, and facial modalities. Includes field descriptions,
label normalization mappings, example outputs for each modality, and
compatibility notes for the RUXAILAB frontend and Firestore storage.

Introduce sentiment_normalization package containing:
- schema.py: Pydantic models (StandardizedOutput, ResultEntry, Segment)
  that enforce the v1 output schema with type validation
- normalizer.py: converter functions for text sentiment (BERTweet),
  audio pipeline outputs, and facial emotion percentages

Label normalization maps model-specific labels (POS/NEG/NEU, Angry/Happy/etc.)
to a consistent lowercase vocabulary. Scores from 0-100 ranges are
automatically converted to 0.0-1.0.
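
A minimal sketch of that normalization logic, assuming a simple dict-based label map and a scale heuristic (the exact vocabulary and conversion rule in `normalizer.py` may differ):

```python
# Hypothetical label map; the canonical vocabulary is defined in
# docs/emotion_output_schema_v1.md.
LABEL_MAP = {
    "POS": "positive",
    "NEG": "negative",
    "NEU": "neutral",
    "Happy": "happy",
    "Angry": "angry",
    "Sad": "sad",
}


def normalize_label(raw: str) -> str:
    """Map a model-specific label to the shared lowercase vocabulary."""
    return LABEL_MAP.get(raw, raw.lower())


def normalize_score(value: float) -> float:
    """Convert 0-100 percentage scores to 0.0-1.0 ratios.

    Assumes any value above 1.0 is a percentage; values already in
    [0, 1] pass through unchanged.
    """
    return value / 100.0 if value > 1.0 else value
```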

25 tests covering schema validation, label mapping, text sentiment
normalization, audio pipeline conversion, facial emotion conversion,
edge cases (unknown labels, long text truncation, error utterance
skipping, percentage-to-ratio conversion), and label map completeness.

JSON file showing how a multi-utterance audio transcript sentiment
result looks when converted to the v1 standardized schema, with
segment timestamps, normalized labels, and a task identifier.
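
For reference, a segment-level audio result in this shape might look roughly like the following (field names and values are illustrative; the canonical example lives in `examples/audio_sentiment_standardized_example.json`):

```json
{
  "schema_version": "1.0",
  "modality": "audio",
  "task": "transcript_sentiment",
  "results": [
    {
      "label": "positive",
      "score": 0.91,
      "segment": { "start": 0.0, "end": 3.2, "text": "…" }
    },
    {
      "label": "neutral",
      "score": 0.67,
      "segment": { "start": 3.2, "end": 7.8, "text": "…" }
    }
  ]
}
```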