MammAlps PR Description #832
…n on alpine wildlife videos
- Implements animal, action, and activity recognition subtasks for the MammAlps dataset
- Includes dataset builder, YAML configs, and strict Jaccard metric evaluation
- Utilities for prompt/answer extraction, result processing, and HuggingFace video download
- See mammalps/README.md for details and usage instructions
# Create model_used_date_time directory structure in results directory
import datetime

timestamp = datetime.datetime.now().strftime("%Y%m%d_%H%M")
model_name = "InternVL3-8B"  # Can be made configurable if needed

# Use results directory in the lmms-eval repository
results_base_dir = os.path.join(os.getcwd(), "results")
output_dir = os.path.join(results_base_dir, f"{model_name}_{timestamp}")

# Create directory if it doesn't exist
os.makedirs(output_dir, exist_ok=True)
This part is hardcoded
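One possible way to address this, sketched under assumptions: pass the model name in as an argument instead of fixing it in the function body. `make_output_dir` and its parameters are hypothetical names for illustration, not part of this PR or of lmms-eval.

```python
import datetime
import os


def make_output_dir(model_name, results_base_dir=None, now=None):
    """Create and return <results_base_dir>/<model_name>_<timestamp>/.

    Hypothetical helper: the name and signature are assumptions.
    """
    # Default to a "results" directory under the current working directory,
    # matching the behavior of the snippet under review.
    if results_base_dir is None:
        results_base_dir = os.path.join(os.getcwd(), "results")
    # Allow the caller to inject a fixed time, which makes the path testable.
    if now is None:
        now = datetime.datetime.now()
    timestamp = now.strftime("%Y%m%d_%H%M")
    output_dir = os.path.join(results_base_dir, f"{model_name}_{timestamp}")
    # Create the directory if it doesn't exist.
    os.makedirs(output_dir, exist_ok=True)
    return output_dir
```

The caller would then supply the name of whichever model is being evaluated, for example `make_output_dir("InternVL3-8B")`.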
Hi, thank you for the PR! I think this PR includes the changes from your previous PR. Do you want to merge all the changes in one PR, or merge them separately?
Putting this PR on hold; will revisit and reopen later.
Summary
- … _cot.jsonl–based dataset_builder.py that produces Hugging Face–ready datasets (animalkingdom, mammalnet, or mammalps) with unified splits.
- … mammalps_doc_to_visual, mammalps_doc_to_text, mammalps_doc_to_target, mammalps_process_results) and strict Jaccard scoring with aggregation for reproducible results.
- … num_frame=32, use_temporal_context=True).

Details
- … luciehmct/mammalps.
- … results/<model>_<timestamp>/mammalps_<subtask>.jsonl.
- … lmms_eval/api/metrics.py) computes strict overlap before mean aggregation.
- … use_temporal_context=True, giving the model richer temporal cues.

Testing
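The strict Jaccard scoring with mean aggregation described above could be sanity-checked with a sketch like the following. This is not the code in lmms_eval/api/metrics.py: the function names are hypothetical, "strict" is assumed to mean exact string matching of labels with no partial credit, and scoring two empty label sets as 1.0 is also an assumption.

```python
def jaccard(predicted, reference):
    """Jaccard index |A ∩ B| / |A ∪ B| over two collections of labels.

    Labels match only if the strings are identical (assumed meaning of "strict").
    """
    pred, ref = set(predicted), set(reference)
    union = pred | ref
    if not union:
        # Assumption: predicting nothing when nothing is expected is a perfect match.
        return 1.0
    return len(pred & ref) / len(union)


def mean_jaccard(pairs):
    """Average the per-sample Jaccard scores over (predicted, reference) pairs."""
    scores = [jaccard(p, r) for p, r in pairs]
    return sum(scores) / len(scores) if scores else 0.0
```

Computing the per-sample overlap first and averaging afterwards means every sample contributes equally to the final score, regardless of how many labels it carries.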