scorer.py does not support event detection algorithms
Summary
scorer.py is built around comparing two continuous, time-aligned signals. This makes it unsuitable for evaluating algorithms whose output is a set of discrete events rather than a continuous data stream (e.g., sit-to-stand transition detection, step detection, gesture recognition).
Current Behavior
scorer.py, via resample(), expects two CSV files, each containing:
- A column of continuous numeric values (e.g., heart rate, SpO2)
- A column of timestamps
It trims both files to their overlapping time range, resamples both to a common frequency using ebsig.periodize(), and then computes point-to-point correlation statistics (Pearson r, Spearman ρ, Kendall τ) and a Bland-Altman mean difference plot.
This pipeline assumes:
- The algorithm output is a dense, regularly-sampled signal
- Ground truth is also a dense, regularly-sampled signal from a reference device
- The two signals can be meaningfully compared value-by-value at each timestep
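To make these assumptions concrete, here is a minimal standard-library sketch of a continuous comparison. It is not the scorer.py implementation and does not call ebsig.periodize(): the helper names (interpolate, compare_signals), the period_s parameter, and the use of linear interpolation as the resampling step are all assumptions made for illustration.

```python
from bisect import bisect_left
from math import sqrt


def interpolate(times, values, t):
    """Linearly interpolate a sorted (times, values) series at time t."""
    k = bisect_left(times, t)
    if k >= len(times):          # guard against rounding past the last sample
        return values[-1]
    if times[k] == t:
        return values[k]
    t0, t1 = times[k - 1], times[k]
    v0, v1 = values[k - 1], values[k]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def compare_signals(a_times, a_vals, b_times, b_vals, period_s=1.0):
    """Trim to the overlapping range, resample both signals onto one grid,
    then return (Pearson r, mean difference a - b).

    Assumes both series are sorted by time, overlap, and are not constant.
    """
    start = max(a_times[0], b_times[0])
    end = min(a_times[-1], b_times[-1])
    n = int((end - start) / period_s)
    grid = [min(start + i * period_s, end) for i in range(n + 1)]

    a = [interpolate(a_times, a_vals, t) for t in grid]
    b = [interpolate(b_times, b_vals, t) for t in grid]

    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    r = cov / sqrt(var_a * var_b)

    # The mean of the pairwise differences is the bias line of a Bland-Altman plot.
    bias = sum(x - y for x, y in zip(a, b)) / len(a)
    return r, bias
```

Every step here depends on a value existing at each grid point in both inputs, which is exactly what a list of detected events does not provide.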
Problem
Some algorithms produce discrete event outputs — a list of detected timestamps or intervals — rather than a continuous signal. Examples include:
- Sit-to-stand / stand-to-sit transition detection
- Step detection
- Fall detection
- Gesture or posture classification
For these algorithms:
- There is no continuous stream to resample or correlate
- Ground truth is typically an annotation file (e.g., a JSON with labeled start/end times), not a reference device CSV
- The appropriate metrics are precision, recall, and F1 score, computed by matching detected events to ground-truth events within some time tolerance window
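The matching step described above could look roughly like the following sketch. It assumes events are plain timestamps in seconds and uses a greedy one-to-one match over the sorted lists; the names EventScore, match_events, and tolerance_s are hypothetical and do not exist in the repository today.

```python
from dataclasses import dataclass


@dataclass
class EventScore:
    true_positives: int
    false_positives: int
    false_negatives: int

    @property
    def precision(self) -> float:
        detected = self.true_positives + self.false_positives
        return self.true_positives / detected if detected else 0.0

    @property
    def recall(self) -> float:
        actual = self.true_positives + self.false_negatives
        return self.true_positives / actual if actual else 0.0

    @property
    def f1(self) -> float:
        p, r = self.precision, self.recall
        return 2 * p * r / (p + r) if (p + r) else 0.0


def match_events(detected, truth, tolerance_s=1.0):
    """Greedily pair each detection with at most one ground-truth timestamp
    that lies within tolerance_s seconds of it."""
    detected = sorted(detected)
    truth = sorted(truth)
    i = j = tp = 0
    while i < len(detected) and j < len(truth):
        diff = detected[i] - truth[j]
        if abs(diff) <= tolerance_s:
            tp += 1          # matched pair: consume both events
            i += 1
            j += 1
        elif diff < 0:
            i += 1           # detection is too early for any remaining truth event
        else:
            j += 1           # truth event has no detection close enough
    return EventScore(tp, len(detected) - tp, len(truth) - tp)


score = match_events([1.0, 5.2, 9.0, 20.0], [1.3, 5.0, 12.0], tolerance_s=0.5)
print(score.precision, score.recall, score.f1)
```

In this example two of the four detections fall within 0.5 s of a ground-truth event, so precision is 2/4 and recall is 2/3. The tolerance value strongly affects the result and would need to be chosen per algorithm.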
scorer.py cannot be used for this class of algorithm without significant changes, yet the current README implies that scorer.py is the standard evaluation path for all algorithms.
Proposed Solution
Add support for event detection evaluation, either as:
- A new script (e.g., event_scorer.py) that:
  - Accepts a ground-truth annotation JSON (list of labeled intervals with start_time / end_time)
  - Accepts an algorithm output file (list of detected event timestamps or intervals)
  - Matches detections to ground-truth events within a configurable time tolerance
  - Reports per-class and overall precision, recall, and F1
  - Optionally generates a timeline visualization overlaying detections and ground truth
- Or an extension to scorer.py with a new score_events() function that handles this case
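As a starting point for discussion, a score_events() function might be sketched as below. The JSON schemas are assumptions, since neither file format is defined yet: annotations are taken to be a list of objects with label, start_time, and end_time, and detections a list of objects with label and time. The prf helper, the tolerance_s parameter, and the micro-averaged "overall" entry are likewise illustrative choices.

```python
import json
from collections import defaultdict


def prf(tp, fp, fn):
    """Precision, recall, and F1 from raw counts (0.0 when undefined)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


def score_events(annotations_json, detections_json, tolerance_s=0.5):
    """Score detected timestamps against labeled ground-truth intervals.

    A detection counts as a true positive if it falls inside a not-yet-matched
    interval of the same label, widened by tolerance_s on both sides.
    """
    truth = defaultdict(list)
    for ann in json.loads(annotations_json):
        truth[ann["label"]].append((ann["start_time"], ann["end_time"]))
    detected = defaultdict(list)
    for det in json.loads(detections_json):
        detected[det["label"]].append(det["time"])

    report = {}
    totals = [0, 0, 0]
    for label in sorted(set(truth) | set(detected)):
        intervals = sorted(truth.get(label, []))
        times = sorted(detected.get(label, []))
        used = [False] * len(intervals)
        tp = 0
        for t in times:
            for k, (start, end) in enumerate(intervals):
                if not used[k] and start - tolerance_s <= t <= end + tolerance_s:
                    used[k] = True   # each interval can be matched only once
                    tp += 1
                    break
        counts = (tp, len(times) - tp, len(intervals) - tp)
        report[label] = prf(*counts)
        totals = [a + b for a, b in zip(totals, counts)]

    # Overall scores are micro-averaged: counts are pooled across labels.
    report["overall"] = prf(*totals)
    return report


annotations = json.dumps([
    {"label": "sit_to_stand", "start_time": 10.0, "end_time": 12.0},
    {"label": "stand_to_sit", "start_time": 20.0, "end_time": 22.0},
    {"label": "sit_to_stand", "start_time": 30.0, "end_time": 32.0},
])
detections = json.dumps([
    {"label": "sit_to_stand", "time": 10.5},
    {"label": "sit_to_stand", "time": 11.0},
    {"label": "stand_to_sit", "time": 22.3},
    {"label": "sit_to_stand", "time": 50.0},
])
for label, scores in score_events(annotations, detections).items():
    print(label, scores)
```

Note that the second sit_to_stand detection at 11.0 s is counted as a false positive because its interval was already matched. Whether duplicate detections should be penalized this way is a design decision the final implementation would need to make.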
The README's "evaluate a new algorithm" section should also be updated to describe both evaluation paths — continuous signal comparison and event detection — so users know which tool to use for their algorithm type.