Labels: enhancement gsoc-2026 multimodal
Depends on: #1988
## Text module

### What it should do
Take a string of text (a participant's transcribed comment, or their answer to a task question) and return a `SentimentOutput` with `modality: "text"`.
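A minimal sketch of what this could look like. The `SentimentOutput` shape is defined in #1988, so the field names below are assumptions, and the lexicon scorer is a stand-in for whatever real sentiment model gets used:

```python
from dataclasses import dataclass

# Hypothetical sketch of the SentimentOutput shape from #1988;
# the field names here are assumptions, not the project's definition.
@dataclass
class SentimentOutput:
    modality: str   # "text", "audio", ...
    label: str      # e.g. "positive", "negative", "neutral"
    score: float    # confidence in [0, 1]

# Tiny placeholder lexicons standing in for a real sentiment model.
_NEGATIVE = {"confusing", "frustrating", "broken", "hard", "lost"}
_POSITIVE = {"easy", "clear", "helpful", "great", "intuitive"}

def analyze_text(text: str) -> SentimentOutput:
    """Score a participant comment and tag it with the text modality."""
    words = {w.strip(".,!?").lower() for w in text.split()}
    neg = len(words & _NEGATIVE)
    pos = len(words & _POSITIVE)
    if pos == neg:
        return SentimentOutput(modality="text", label="neutral", score=0.5)
    label = "positive" if pos > neg else "negative"
    score = max(pos, neg) / (pos + neg)
    return SentimentOutput(modality="text", label=label, score=score)
```

For example, `analyze_text("The menu was confusing and I got lost.")` would come back labeled `"negative"` with `modality: "text"`.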
### Intent detection (optional but valuable)
If time allows, it would be useful to also detect intent from the text: not just what emotion the participant expressed, but whether they're making a complaint, offering a suggestion, asking a question, or describing confusion. This doesn't need a separate model; a simple keyword/pattern approach is fine for now. The intent label can feed into the usability mapping the same way an emotion label does.
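The keyword/pattern approach mentioned above could be as simple as an ordered list of regexes. The specific intent labels and patterns here are illustrative assumptions, not a fixed taxonomy:

```python
import re

# First matching pattern wins; order encodes priority.
# Labels and patterns are illustrative assumptions.
_INTENT_PATTERNS = [
    ("question",   re.compile(r"\?\s*$|^\s*(how|what|where|why|can i)\b", re.I)),
    ("suggestion", re.compile(r"\b(should|could|maybe|it would be (nice|better))\b", re.I)),
    ("confusion",  re.compile(r"\b(confus\w*|lost|unclear|don'?t understand)\b", re.I)),
    ("complaint",  re.compile(r"\b(annoying|frustrat\w*|broken|hate|doesn'?t work)\b", re.I)),
]

def detect_intent(text: str) -> str:
    """Return a coarse intent label for a participant comment."""
    for label, pattern in _INTENT_PATTERNS:
        if pattern.search(text):
            return label
    return "statement"  # fallback when nothing matches
```

So `detect_intent("Where is the export button?")` yields `"question"`, and `detect_intent("Maybe add a search bar")` yields `"suggestion"` — each of which can then flow into the usability mapping alongside the emotion label.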
## Audio module

### What it should do
Take a path to an audio file (WAV or MP3; a short clip from a usability session recording) and return a `SentimentOutput` with `modality: "audio"`.
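A hedged sketch of the audio entry point, assuming the same hypothetical `SentimentOutput` shape as the text module. It only validates the format and checks the WAV clip is readable with the stdlib `wave` module; the actual emotion model is deliberately stubbed out:

```python
import wave
from dataclasses import dataclass
from pathlib import Path

# Same hypothetical SentimentOutput shape assumed for the text module.
@dataclass
class SentimentOutput:
    modality: str
    label: str
    score: float

def analyze_audio(path: str) -> SentimentOutput:
    """Validate an audio clip and return a sentiment result for it."""
    p = Path(path)
    suffix = p.suffix.lower()
    if suffix not in {".wav", ".mp3"}:
        raise ValueError(f"unsupported audio format: {p.suffix}")
    if suffix == ".wav":
        # Sanity-check that the clip is readable and non-empty; a real
        # implementation would hand the samples to an audio emotion
        # model here instead of discarding them.
        with wave.open(str(p), "rb") as wf:
            duration = wf.getnframes() / wf.getframerate()
        if duration == 0:
            raise ValueError("empty audio clip")
    # Placeholder result until a real model is wired in.
    return SentimentOutput(modality="audio", label="neutral", score=0.5)
```

MP3 decoding is left out of the sketch because the stdlib cannot read MP3; the real module would need a decoding dependency for that path.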