luciehmct

Summary

  • Adds the MammAlps evaluation suite, with three subtasks (animal, action, activity recognition) grounded in alpine camera-trap footage.
  • Provides a _cot.jsonl–based dataset_builder.py that produces Hugging Face–ready datasets (animalkingdom, mammalnet, or mammalps) with unified splits.
  • Uses shared utilities (mammalps_doc_to_visual, mammalps_doc_to_text, mammalps_doc_to_target, mammalps_process_results) and strict Jaccard scoring with aggregation for reproducible results.
  • Documents InternVL3 video evaluation defaults (OpenGVLab/InternVL3-8B, batch size 1, num_frame=32, use_temporal_context=True).
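
The Jaccard scoring with mean aggregation mentioned above can be sketched roughly as follows. This is a minimal illustration, not the code in lmms_eval/api/metrics.py: the function names jaccard_score and mean_jaccard, the lowercase/whitespace normalization, and the convention that two empty label sets score 1.0 are all assumptions.

```python
from typing import Iterable, List


def jaccard_score(predicted: Iterable[str], expected: Iterable[str]) -> float:
    """Set-based intersection over union between predicted and expected labels.

    Hypothetical helper: the normalization (strip + lowercase) is an assumption.
    """
    pred = {label.strip().lower() for label in predicted if label.strip()}
    gold = {label.strip().lower() for label in expected if label.strip()}
    if not pred and not gold:
        # Assumed convention: two empty label sets count as a perfect match.
        return 1.0
    return len(pred & gold) / len(pred | gold)


def mean_jaccard(scores: List[float]) -> float:
    """Mean aggregation over per-example scores (0.0 for an empty list)."""
    return sum(scores) / len(scores) if scores else 0.0
```

Under these assumptions, a prediction of two labels of which one is correct, against a single ground-truth label, scores 1/2, and an extra predicted label always lowers the score.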

Details

  • Dataset specifics: MammAlps contains Swiss National Park wildlife clips annotated for species, fine-grained actions, and higher-level activities. The builder can output unified datasets with consistent directory structures and Hugging Face JSON records.
  • Evaluation flow:
    • Each subtask config loads clips from luciehmct/mammalps.
    • Predictions are parsed with the “Final answer: [...]” format.
    • Per-example logs (prompt, response, parsed labels, ground truth, Jaccard score) are stored under results/<model>_<timestamp>/mammalps_<subtask>.jsonl.
    • The global Jaccard metric (in lmms_eval/api/metrics.py) computes strict overlap before mean aggregation.
  • InternVL3 integration: Frames are timestamped when use_temporal_context=True, giving the model richer temporal cues.
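
Parsing the “Final answer: [...]” format from a model response might look like the sketch below. The regular expression, the parse_final_answer name, and the choices to take the last occurrence, strip quotes, and lowercase labels are assumptions for illustration; the PR's mammalps_process_results may behave differently.

```python
import re
from typing import List

# Matches "Final answer: [ ... ]" case-insensitively, capturing the bracket contents.
FINAL_ANSWER_RE = re.compile(r"final answer:\s*\[(.*?)\]", re.IGNORECASE | re.DOTALL)


def parse_final_answer(response: str) -> List[str]:
    """Extract a de-duplicated list of labels from a model response.

    Hypothetical helper: returns an empty list when no final answer is found.
    """
    matches = FINAL_ANSWER_RE.findall(response)
    if not matches:
        return []
    raw = matches[-1]  # assumed: the last "Final answer" in the response wins
    labels: List[str] = []
    for item in raw.split(","):
        label = item.strip().strip("'\"").strip().lower()
        if label and label not in labels:
            labels.append(label)
    return labels
```

Returning an empty list for an unparseable response means such an example would score 0 against any non-empty ground truth under a set-overlap metric, rather than raising an error mid-run.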

Testing

# Action recognition
python -m lmms_eval \
  --model internvl3 \
  --model_args "pretrained=OpenGVLab/InternVL3-8B,modality=video,num_frame=32,use_temporal_context=True" \
  --tasks mammalps_action \
  --batch_size 1 \
  --output_path "$OUT_DIR"

# Run all three MammAlps subtasks together
python -m lmms_eval \
  --model internvl3 \
  --model_args "pretrained=OpenGVLab/InternVL3-8B,modality=video,num_frame=32,use_temporal_context=True" \
  --tasks mammalps \
  --batch_size 1 \
  --output_path "$OUT_DIR"

Comment on lines +510 to +521
# Create model_used_date_time directory structure in results directory
import datetime

timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M")
model_name = "InternVL3-8B" # Can be made configurable if needed

# Use results directory in the lmms-eval repository
results_base_dir = os.path.join(os.getcwd(), "results")
output_dir = os.path.join(results_base_dir, f"{model_name}_{timestamp}")

# Create directory if it doesn't exist
os.makedirs(output_dir, exist_ok=True)
Collaborator

This part is hardcoded
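
One way to address this would be to derive the directory name from the model arguments instead of a literal string. The sketch below is only an illustration: build_output_dir, its parameters, and the idea of taking the last path component of the pretrained value are assumptions, not part of the PR or of lmms-eval.

```python
import datetime
import os
from typing import Optional


def build_output_dir(
    pretrained: str,
    results_base_dir: Optional[str] = None,
    now: Optional[datetime.datetime] = None,
) -> str:
    """Create and return <base>/<model_name>_<timestamp> (hypothetical helper)."""
    # "OpenGVLab/InternVL3-8B" -> "InternVL3-8B"; fall back to "model" if empty.
    model_name = os.path.basename(pretrained.rstrip("/")) or "model"
    timestamp = (now or datetime.datetime.now()).strftime("%Y%m%d_%H%M")
    # Default mirrors the snippet under review: ./results in the working directory.
    base = results_base_dir or os.path.join(os.getcwd(), "results")
    output_dir = os.path.join(base, f"{model_name}_{timestamp}")
    os.makedirs(output_dir, exist_ok=True)
    return output_dir
```

Passing the base directory and clock in as parameters keeps the default behaviour of the reviewed snippet while letting callers and tests override both.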

Collaborator

@kcz358 kcz358 left a comment

Hi, thank you for the PR! I think this PR includes the changes from your previous PR. Do you want to merge all the changes in one PR, or would you rather merge them separately?

@luciehmct luciehmct closed this Oct 6, 2025
@luciehmct
Author

Putting this PR on hold; will revisit and reopen later.
