Core system for Metrics #1442

Merged
merged 32 commits into from
Aug 28, 2024
Changes from 9 commits
Commits
7d7e5f1
Initial attempt at definining a metric
LinasKo Aug 8, 2024
b061f4f
fix(pre_commit): 🎨 auto format pre-commit hooks
pre-commit-ci[bot] Aug 12, 2024
4c903cf
Metrics changes + tests
LinasKo Aug 12, 2024
d6d71a0
fix(pre_commit): 🎨 auto format pre-commit hooks
pre-commit-ci[bot] Aug 12, 2024
6f566b5
Remove incorrect test
LinasKo Aug 12, 2024
4add8c8
fix(pre_commit): 🎨 auto format pre-commit hooks
pre-commit-ci[bot] Aug 12, 2024
0305cee
Simplified tests
LinasKo Aug 12, 2024
3e860a5
Add invalid value test
LinasKo Aug 12, 2024
a4cf260
fix(pre_commit): 🎨 auto format pre-commit hooks
pre-commit-ci[bot] Aug 12, 2024
5565241
Basic mask tests
LinasKo Aug 12, 2024
e1ffd0e
Address some review comments
LinasKo Aug 14, 2024
dabd18c
Remove duplicate box_iou_batch method
LinasKo Aug 14, 2024
b40cddd
Validate class IDs together
LinasKo Aug 14, 2024
758b9d6
Expose class_id, add missing docs
LinasKo Aug 14, 2024
0e8365f
Add dependency: typing-extensions
LinasKo Aug 14, 2024
6675844
Simplified core metrics
LinasKo Aug 15, 2024
6f47cd3
Add to_pandas to IoU
LinasKo Aug 15, 2024
85e0a54
Return dataclass for IoU. plot + as_pandas in result.
LinasKo Aug 15, 2024
ab76a47
Refactored metrics store, removed typing extensions
LinasKo Aug 16, 2024
c31d96d
Merge branch 'develop' into feat/metrics-module-initial
onuralpszr Aug 19, 2024
fe27e44
supervision metrics module
LinasKo Aug 21, 2024
a5bb93d
bugfix: hstack confidence, class_id
LinasKo Aug 26, 2024
d1e5e03
bugfix: incorrect indexing into 0-sized data in data stores
LinasKo Aug 26, 2024
026a243
Give descriptive names to data_1, data_2 in mAP
LinasKo Aug 27, 2024
84bb44d
Metrics docs overhaul
LinasKo Aug 27, 2024
57e5c10
Merge branch 'develop' into feat/metrics-module-initial
LinasKo Aug 27, 2024
ec5790a
Removed print statement internal_data_store
LinasKo Aug 27, 2024
8ff0d31
Removed internal store from mAP
LinasKo Aug 27, 2024
162f9ed
fix mAP for small/medium/large objects
LinasKo Aug 28, 2024
6ac0f84
fix small object area calculation
LinasKo Aug 28, 2024
6bb5929
Remove IoU metric for now - unclear external API
LinasKo Aug 28, 2024
55c8c07
Rename legacy metrics doc to metrics, preserve url
LinasKo Aug 28, 2024
1,046 changes: 601 additions & 445 deletions poetry.lock

Large diffs are not rendered by default.

7 changes: 5 additions & 2 deletions pyproject.toml
@@ -51,15 +51,19 @@ scipy = [
matplotlib = ">=3.6.0"
pyyaml = ">=5.3"
defusedxml = "^0.7.1"
pillow = ">=9.4"
opencv-python = { version = ">=4.5.5.64", optional = true }
opencv-python-headless = ">=4.5.5.64"
requests = { version = ">=2.26.0,<=2.32.3", optional = true }
tqdm = { version = ">=4.62.3,<=4.66.5", optional = true }
pillow = ">=9.4"
# pandas: picked lowest major version that supports Python 3.8
pandas = { version = ">=2.0.0", optional = true }
Review comment from a Collaborator:

@onuralpszr, do we need to do something extra here?

onuralpszr (Collaborator), Aug 14, 2024:

@SkalskiP Sorry, I missed this message because it was hidden. Making pandas optional is correct, and I see the "stub" package is optional too, which is also good. If I may, I would also like to say: if you need faster metrics and better RAM management, we should strongly consider "polars" as well. I can put together a raw example and send you a Colab notebook showing what I mean. We can also convert to pandas, or read from pandas, NumPy, etc., so the options are still open. You saw the speed of "ruff" (Rust); you can see it with this one too.

cc : @LinasKo please check as well.

pandas-stubs = { version = ">=2.0.0.230412", optional = true }

[tool.poetry.extras]
desktop = ["opencv-python"]
assets = ["requests", "tqdm"]
metrics = ["pandas", "pandas-stubs"]

[tool.poetry.group.dev.dependencies]
twine = "^5.1.1"
@@ -79,7 +83,6 @@ docutils = [
{ version = "^0.21.1", python = ">=3.9" },
]


[tool.poetry.group.docs.dependencies]
mkdocs-material = { extras = ["imaging"], version = "^9.5.5" }
mkdocstrings = { extras = ["python"], version = ">=0.20,<0.26" }
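The `metrics` extra above makes pandas optional, so code that needs it has to check for it at call time. A minimal sketch of that guard pattern follows; `require_optional` is a hypothetical helper written for illustration, not something this PR or supervision provides, and the error text is only modeled on the message in the diff.

```python
import importlib
from types import ModuleType


def require_optional(module_name: str, extra: str) -> ModuleType:
    """Import an optional dependency, or fail with an install hint.

    Hypothetical helper for illustration; not part of supervision.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as error:
        # Chain the original error so the real cause stays in the traceback.
        raise ImportError(
            f"`{module_name}` is required for this feature."
            f" Install it with `pip install 'supervision[{extra}]'`."
        ) from error


# A standard-library module always imports, so this call returns the module.
json_module = require_optional("json", extra="metrics")
```

Calling the helper with a module that is not installed raises `ImportError` with the install hint, which is the behavior `_ensure_pandas_installed` aims for in `core.py`.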
2 changes: 2 additions & 0 deletions supervision/metrics/__init__.py
@@ -0,0 +1,2 @@
from supervision.metrics.core import Metric, MetricTarget, UnsupportedMetricTargetError
from supervision.metrics.intersection_over_union import IntersectionOverUnion
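The exported `Metric` base class asks subclasses for `update`, `reset`, and `compute`, with `update` returning the metric so calls can be chained. The sketch below shows that contract with a toy metric. It redefines a stand-in `Metric` so it runs without supervision installed, and `MeanValue` is a hypothetical example class, not part of this PR.

```python
from abc import ABC, abstractmethod
from typing import Any, List


class Metric(ABC):
    """Stand-in for the PR's base class, redefined so the sketch runs alone."""

    @abstractmethod
    def update(self, *args, **kwargs) -> "Metric": ...

    @abstractmethod
    def reset(self) -> None: ...

    @abstractmethod
    def compute(self, *args, **kwargs) -> Any: ...


class MeanValue(Metric):
    """Hypothetical toy metric: the mean of every value seen so far."""

    def __init__(self) -> None:
        self._values: List[float] = []

    def update(self, values: List[float]) -> "MeanValue":
        self._values.extend(values)
        return self  # returning self is what enables chaining

    def reset(self) -> None:
        self._values = []

    def compute(self) -> float:
        if not self._values:
            return 0.0
        return sum(self._values) / len(self._values)


metric = MeanValue()
result = metric.update([1.0, 2.0]).update([3.0]).compute()
print(result)  # 2.0
```

Keeping accumulation in `update` and calculation in `compute` lets a caller feed data batch by batch and pay for the computation once at the end.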
210 changes: 210 additions & 0 deletions supervision/metrics/core.py
@@ -0,0 +1,210 @@
from __future__ import annotations

from abc import ABC, abstractmethod
from enum import Enum
from typing import Any, Dict, Iterator, Optional, Tuple, Union

import numpy as np
import numpy.typing as npt

from supervision import config
from supervision.detection.core import Detections

"""Used by metrics module as class ID, when none is present"""
CLASS_ID_NONE = -1


class Metric(ABC):
    """
    The base class for all supervision metrics.
    """

    @abstractmethod
    def update(self, *args, **kwargs) -> Metric:
        """
        Add data to the metric, without computing the result.
        Return the metric itself to allow method chaining.

        Example:
            ```python
            result = metric.update(data).compute()
            ```
        """
        raise NotImplementedError

    @abstractmethod
    def reset(self) -> None:
        """
        Reset internal metric state.
        """
        raise NotImplementedError

    @abstractmethod
    def compute(self, *args, **kwargs) -> Any:
        """
        Compute the metric from the internal state and return the result.
        """
        raise NotImplementedError

    # TODO: determine if this is necessary.
    # @abstractmethod
    # def to_pandas(self, *args, **kwargs) -> Any:
    #     """
    #     Return a pandas DataFrame representation of the metric.
    #     """
    #     self._ensure_pandas_installed()
    #     raise NotImplementedError

    def _ensure_pandas_installed(self):
        try:
            import pandas  # noqa
        except ImportError as error:
            raise ImportError(
                "Function `to_pandas` requires the `metrics` extra to be installed."
                " Run `pip install 'supervision[metrics]'` or"
                " `poetry add supervision -E metrics`."
            ) from error


class MetricTarget(Enum):
    """
    Specifies what type of detection is used to compute the metric.
    """

    BOXES = "boxes"
    MASKS = "masks"
    ORIENTED_BOUNDING_BOXES = "obb"


class UnsupportedMetricTargetError(Exception):
    """
    Raised when a metric does not support the specified target (and never will).
    If the support might be added in the future, raise `NotImplementedError` instead.
    """

    def __init__(self, metric: Metric, target: MetricTarget):
        super().__init__(f"Metric {metric} does not support target {target}")


class InternalMetricDataStore:
    """
    Stores internal data of IntersectionOverUnion metric:
    * Stores the basic data: boxes, masks, or oriented bounding boxes
    * Validates data: ensures data types and shape are consistent
    * Provides iteration by class
    """

    def __init__(self, metric_target: MetricTarget, class_agnostic: bool):
        self._metric_target = metric_target
        self._class_agnostic = class_agnostic
        self._data_1: Dict[int, npt.NDArray]
        self._data_2: Dict[int, npt.NDArray]
        self._datapoint_shape: Optional[Tuple[int, ...]]
        self.reset()

    def reset(self) -> None:
        self._data_1 = {}
        self._data_2 = {}
        if self._metric_target == MetricTarget.BOXES:
            self._datapoint_shape = (4,)
        elif self._metric_target == MetricTarget.MASKS:
            # Determined when adding data
            self._datapoint_shape = None
        elif self._metric_target == MetricTarget.ORIENTED_BOUNDING_BOXES:
            self._datapoint_shape = (8,)

    def update(
        self,
        data_1: Union[npt.NDArray, Detections],
        data_2: Union[npt.NDArray, Detections],
    ) -> None:
        content_1 = self._get_content(data_1)
        content_2 = self._get_content(data_2)
        class_ids_1 = self._get_class_ids(data_1)
        class_ids_2 = self._get_class_ids(data_2)
        self._validate_class_ids(class_ids_1)
        self._validate_class_ids(class_ids_2)
        if content_1 is not None and len(content_1) > 0:
            assert len(content_1) == len(class_ids_1)
            for class_id in set(class_ids_1):
                content_of_class = content_1[class_ids_1 == class_id]
                if class_id not in self._data_1:
                    self._data_1[class_id] = content_of_class
                    continue
                self._data_1[class_id] = np.vstack(
                    (self._data_1[class_id], content_of_class)
                )

        if content_2 is not None and len(content_2) > 0:
            assert len(content_2) == len(class_ids_2)
            for class_id in set(class_ids_2):
                content_of_class = content_2[class_ids_2 == class_id]
                if class_id not in self._data_2:
                    self._data_2[class_id] = content_of_class
                    continue
                self._data_2[class_id] = np.vstack(
                    (self._data_2[class_id], content_of_class)
                )

    def __iter__(
        self,
    ) -> Iterator[Tuple[int, Optional[npt.NDArray], Optional[npt.NDArray]]]:
        class_ids = sorted(
            set.union(set(self._data_1.keys()), set(self._data_2.keys()))
        )
        for class_id in class_ids:
            yield (
                class_id,
                self._data_1.get(class_id, None),
                self._data_2.get(class_id, None),
            )

    def _get_content(
        self, data: Union[npt.NDArray, Detections]
    ) -> Optional[npt.NDArray]:
        """Return boxes, masks or oriented bounding boxes from the data."""
        if not isinstance(data, (Detections, np.ndarray)):
            raise ValueError(
                f"Invalid data type: {type(data)}. Only Detections or np.ndarray are supported."
            )
        if isinstance(data, np.ndarray):
            return data

        if self._metric_target == MetricTarget.BOXES:
            return data.xyxy
        if self._metric_target == MetricTarget.MASKS:
            return data.mask
        if self._metric_target == MetricTarget.ORIENTED_BOUNDING_BOXES:
            obb = data.data.get(config.ORIENTED_BOX_COORDINATES, None)
            if isinstance(obb, list):
                obb = np.array(obb, dtype=np.float32)
            return obb
        raise ValueError(f"Invalid metric target: {self._metric_target}")

    def _get_class_ids(
        self, data: Union[npt.NDArray, Detections]
    ) -> npt.NDArray[np.int_]:
        if self._class_agnostic or isinstance(data, np.ndarray):
            return np.array([CLASS_ID_NONE] * len(data), dtype=int)
        assert isinstance(data, Detections)
        if data.class_id is None:
            return np.array([CLASS_ID_NONE] * len(data), dtype=int)
        return data.class_id

    def _validate_class_ids(self, class_id: npt.NDArray[np.int_]) -> None:
        class_set = set(class_id)
        if len(class_set) >= 2 and -1 in class_set:
            raise ValueError(
                "Metrics store received results with partially defined classes."
            )

    def _validate_shape(self, data: npt.NDArray) -> None:
        if self._datapoint_shape is None:
            assert self._metric_target == MetricTarget.MASKS
            self._datapoint_shape = data.shape[1:]
            return
        if data.shape[1:] != self._datapoint_shape:
            raise ValueError(
                f"Invalid data shape: {data.shape}."
                f" Expected: (N, {self._datapoint_shape})"
            )
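The store above groups incoming rows by class ID and rejects batches that mix real class IDs with the "no class" sentinel. A simplified sketch of those two steps is shown here, assuming plain NumPy arrays instead of `Detections`; `add_by_class` and `validate_class_ids` are hypothetical function names chosen for this illustration, not the PR's API.

```python
from typing import Dict

import numpy as np

CLASS_ID_NONE = -1  # same sentinel value as in the diff


def validate_class_ids(class_ids: np.ndarray) -> None:
    """Reject a batch that mixes real class IDs with the 'no class' sentinel."""
    unique_ids = set(class_ids.tolist())
    if len(unique_ids) >= 2 and CLASS_ID_NONE in unique_ids:
        raise ValueError("Received results with partially defined classes.")


def add_by_class(
    store: Dict[int, np.ndarray], content: np.ndarray, class_ids: np.ndarray
) -> None:
    """Append each row of `content` to the array stored under its class ID."""
    validate_class_ids(class_ids)
    for class_id in sorted(set(class_ids.tolist())):
        rows = content[class_ids == class_id]  # boolean mask selects one class
        if class_id in store:
            rows = np.vstack((store[class_id], rows))
        store[class_id] = rows


store: Dict[int, np.ndarray] = {}
boxes = np.array([[0, 0, 2, 2], [1, 1, 3, 3], [4, 4, 6, 6]], dtype=np.float32)
add_by_class(store, boxes, np.array([0, 1, 0]))  # two boxes of class 0, one of class 1
add_by_class(store, boxes[:1], np.array([1]))  # a second batch adds to class 1
print({class_id: rows.shape for class_id, rows in store.items()})
```

After both calls each class holds two boxes. Converting the IDs with `tolist()` keeps the dictionary keys as plain Python integers, which is a choice made in this sketch and not something the diff does.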
136 changes: 136 additions & 0 deletions supervision/metrics/intersection_over_union.py
@@ -0,0 +1,136 @@
from typing import TYPE_CHECKING, Dict, Union

import numpy as np
import numpy.typing as npt

from supervision.detection.core import Detections
from supervision.metrics.core import InternalMetricDataStore, Metric, MetricTarget

if TYPE_CHECKING:
    pass


class IntersectionOverUnion(Metric):
    def __init__(
        self,
        metric_target: MetricTarget = MetricTarget.BOXES,
        class_agnostic: bool = False,
        iou_threshold: float = 0.25,
    ):
        # TODO: implement for masks and oriented bounding boxes
        if metric_target in [MetricTarget.MASKS, MetricTarget.ORIENTED_BOUNDING_BOXES]:
            raise NotImplementedError(
                f"Intersection over union is not implemented for {metric_target}."
            )

        self._metric_target = metric_target
        self._class_agnostic = class_agnostic
        self._iou_threshold = iou_threshold

        self._store = InternalMetricDataStore(metric_target, class_agnostic)

    def reset(self) -> None:
        self._store.reset()

    def update(
        self,
        data_1: Union[npt.NDArray, Detections],
        data_2: Union[npt.NDArray, Detections],
    ) -> Metric:
        """
        Add data to the metric, without computing the result.

        The arguments can be:
        * Boxes of shape (N, 4), float32.
        * Masks of shape (N, H, W), bool.
        * Oriented bounding boxes of shape (N, 8), float32.
        * Detections object.

        Args:
            data_1 (Union[npt.NDArray, Detections]): The first set of data.
            data_2 (Union[npt.NDArray, Detections]): The second set of data.

        Returns:
            Metric: The metric object itself. You can get the metric result
            by calling the `compute` method.
        """
        self._store.update(data_1, data_2)
        return self

    def compute(self) -> Dict[int, npt.NDArray[np.float32]]:
        """
        Compute the Intersection over Union metric (IoU).
        Uses the data set with the `update` method.

        Returns:
            Dict[int, npt.NDArray[np.float32]]: A dictionary with class IDs as keys.
            If no class ID is provided, the key is the value CLASS_ID_NONE.
        """
        # TODO: cache computed result.
        ious = {}
        for class_id, array_1, array_2 in self._store:
            if self._metric_target == MetricTarget.BOXES:
                if array_1 is None or array_2 is None:
                    ious[class_id] = np.empty((0, 4), dtype=np.float32)
                    continue
                iou = self._compute_box_iou(array_1, array_2)

            else:
                raise NotImplementedError(
                    "Intersection over union is not implemented"
                    f" for {self._metric_target}."
                )
            ious[class_id] = iou
        return ious

    # TODO: This would return dict[int, pd.DataFrame]. Do we want that?
    # It'd be cleaner if it returned a single DataFrame, but the sizes
    # differ if class_agnostic=False.

    # def to_pandas(self) -> 'pd.DataFrame':
    #     """
    #     Return a pandas DataFrame representation of the metric.
    #     """
    #     self._ensure_pandas_installed()
    #     import pandas as pd

    #     # TODO: use cached results
    #     ious = self.compute()
    #     print(len(ious))

    #     class_names = []
    #     arrays = []

    #     for class_id, array in ious.items():
    #         print(array.shape)
    #         class_names.append(np.full(array.shape[0], class_id))
    #         arrays.append(array)
    #     stacked_class_ids = np.concatenate(class_names)
    #     stacked_ious = np.vstack(arrays)
    #     combined = np.column_stack((stacked_class_ids, stacked_ious))

    #     column_names = \
    #         ['class_id'] + [f'col_{i+1}' for i in range(stacked_ious.shape[1])]
    #     result = pd.DataFrame(combined, columns=column_names)

    #     return result

    @staticmethod
    def _compute_box_iou(
        array_1: npt.NDArray, array_2: npt.NDArray
    ) -> npt.NDArray[np.float32]:
        """Computes the pairwise intersection-over-union between two sets of boxes."""

        def box_area(box):
            return (box[2] - box[0]) * (box[3] - box[1])

        area_true = box_area(array_1.T)
        area_detection = box_area(array_2.T)

        top_left = np.maximum(array_1[:, None, :2], array_2[:, :2])
        bottom_right = np.minimum(array_1[:, None, 2:], array_2[:, 2:])

        area_inter = np.prod(np.clip(bottom_right - top_left, a_min=0, a_max=None), 2)
        ious = area_inter / (area_true[:, None] + area_detection - area_inter)
        ious = np.nan_to_num(ious)
        return ious
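To make the pairwise computation in `_compute_box_iou` concrete, here is a standalone sketch that follows the same broadcasting steps on a small input, assuming boxes in `xyxy` format. `pairwise_box_iou` is a hypothetical name for this illustration, and it differs from the diff in one respect: it guards the division with `np.divide(..., where=...)` instead of calling `np.nan_to_num` afterwards.

```python
import numpy as np


def pairwise_box_iou(boxes_a: np.ndarray, boxes_b: np.ndarray) -> np.ndarray:
    """Return an (N, M) matrix of IoU values for xyxy boxes."""
    area_a = (boxes_a[:, 2] - boxes_a[:, 0]) * (boxes_a[:, 3] - boxes_a[:, 1])
    area_b = (boxes_b[:, 2] - boxes_b[:, 0]) * (boxes_b[:, 3] - boxes_b[:, 1])

    # Broadcasting (N, 1, 2) against (M, 2) yields (N, M, 2) corner pairs.
    top_left = np.maximum(boxes_a[:, None, :2], boxes_b[:, :2])
    bottom_right = np.minimum(boxes_a[:, None, 2:], boxes_b[:, 2:])

    # Negative extents mean no overlap, so clip them to zero before multiplying.
    sides = np.clip(bottom_right - top_left, 0, None)
    area_inter = sides.prod(axis=2)

    union = area_a[:, None] + area_b - area_inter
    # Divide only where the union is positive; degenerate pairs stay at 0.
    return np.divide(
        area_inter, union, out=np.zeros_like(union, dtype=np.float64), where=union > 0
    )


boxes_a = np.array([[0, 0, 2, 2]], dtype=np.float64)
boxes_b = np.array([[1, 1, 3, 3], [0, 0, 2, 2], [5, 5, 6, 6]], dtype=np.float64)
ious = pairwise_box_iou(boxes_a, boxes_b)
print(ious.shape)  # (1, 3)
```

For this input the first pair overlaps in a 1x1 square, giving 1 / (4 + 4 - 1) = 1/7; the second pair is identical, giving 1; the third pair is disjoint, giving 0.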