
Add standardized eye-tracking benchmark evaluation module and API end…#80

Open
Sahil-aka wants to merge 1 commit into ruxailab:main from Sahil-aka:feature/benchmark-evaluation

Conversation

@Sahil-aka

Summary

This PR introduces a standardized benchmarking module for evaluating eye-tracking system performance.

"It provides structured accuracy, precision, and data quality metrics aligned with eye-tracking validation and benchmarking best practices."

Motivation

Currently, the project does not provide a reproducible and standardized way to evaluate eye-tracking accuracy and precision across sessions or devices.

This implementation enables:

Consistent benchmark evaluation

Scientific RMS-based precision measurement

Per-target accuracy analysis

Reproducible performance comparison across setups

Key Additions

Accuracy metrics (mean, median, p95) in pixels and degrees

Scientifically grounded RMS precision metric

Data quality reporting (sample count, loss %)

Per-target accuracy breakdown

New API endpoint:

POST /api/session/benchmark

Input validation and NumPy-safe JSON serialization
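The metrics listed above can be sketched as follows. This is a minimal illustration, not the PR's actual implementation: the function name, the `px_per_deg` conversion factor, and the RMS-S2S (sample-to-sample) definition of precision are assumptions based on common eye-tracking benchmarking practice.

```python
import numpy as np

def benchmark_metrics(gaze, targets, expected_samples, px_per_deg=35.0):
    """Sketch of the benchmark metrics described above.

    gaze, targets: (N, 2) arrays of gaze estimates and the target
    positions shown at each sample, in pixels.
    expected_samples: number of samples the tracker should have delivered.
    px_per_deg: assumed pixels-per-degree conversion for the setup.
    """
    gaze = np.asarray(gaze, dtype=float)
    targets = np.asarray(targets, dtype=float)

    # Accuracy: Euclidean offset between gaze and target per sample.
    err_px = np.linalg.norm(gaze - targets, axis=1)

    # Precision: RMS of sample-to-sample gaze displacement (RMS-S2S).
    diffs = np.diff(gaze, axis=0)
    rms_px = float(np.sqrt(np.mean(np.sum(diffs ** 2, axis=1))))

    n = len(gaze)
    return {
        "accuracy_px": {
            "mean": float(np.mean(err_px)),
            "median": float(np.median(err_px)),
            "p95": float(np.percentile(err_px, 95)),
        },
        "accuracy_deg": {"mean": float(np.mean(err_px) / px_per_deg)},
        "precision_rms_px": rms_px,
        "data_quality": {
            "sample_count": n,
            "loss_pct": 100.0 * (1 - n / expected_samples),
        },
    }
```

Casting every value to a native `float` at the boundary is one simple way to keep the payload JSON-serializable, which matches the serialization point above.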

Expected Behavior

POST /api/session/benchmark

The API returns:

Overall benchmark metrics

Per-target accuracy breakdown

Precision metrics

Data quality metrics

All returned values are JSON-serializable and numerically stable.
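A common way to make NumPy-laden responses JSON-serializable is a `default` hook for `json.dumps`. This is a hedged sketch of that pattern, not necessarily the serializer used in the PR:

```python
import json
import numpy as np

def numpy_safe(obj):
    # Convert NumPy scalars and arrays to native Python types so that
    # json.dumps does not raise TypeError on the benchmark payload.
    if isinstance(obj, np.generic):
        return obj.item()
    if isinstance(obj, np.ndarray):
        return obj.tolist()
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")

# Hypothetical payload mixing NumPy scalars and arrays:
payload = {"mean_error_px": np.float64(12.5), "per_target": np.array([10, 15])}
body = json.dumps(payload, default=numpy_safe)
```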


@Sahil-aka
Author

Thanks for the earlier feedback.

This update includes:

• Refactored benchmark endpoint to be session-based (/api/session/<session_id>/benchmark)
• Metrics now use session metadata instead of request parameters
• Added minimum sample validation warning for small datasets
• Added optional benchmark PDF report generation endpoint

Happy to adjust further if this should align differently with the project architecture.
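The session-based flow with the small-dataset warning might look roughly like the sketch below. The session store, field names, and `MIN_SAMPLES` threshold are all hypothetical; the real project presumably resolves sessions from its own storage layer behind the `/api/session/<session_id>/benchmark` route.

```python
# Hypothetical in-memory session store standing in for the project's
# real session storage (assumption, not the PR's actual code).
SESSIONS = {"abc123": {"gaze_samples": [(0, 0)] * 12, "expected_samples": 15}}

MIN_SAMPLES = 50  # assumed threshold below which results are flagged

def benchmark_session(session_id):
    """Resolve benchmark inputs from session metadata rather than
    request parameters, warning when the dataset is small."""
    session = SESSIONS.get(session_id)
    if session is None:
        raise KeyError(f"unknown session: {session_id}")
    samples = session["gaze_samples"]
    result = {
        "session_id": session_id,
        "sample_count": len(samples),
        "warnings": [],
    }
    if len(samples) < MIN_SAMPLES:
        result["warnings"].append(
            f"only {len(samples)} samples; results below "
            f"{MIN_SAMPLES} may be unreliable"
        )
    return result
```

Returning the warning in the response body (rather than failing the request) lets callers still see metrics for short sessions while knowing to treat them cautiously.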

@Sahil-aka Sahil-aka force-pushed the feature/benchmark-evaluation branch from 89e4ae2 to 8d15735 Compare March 31, 2026 14:48
