Project import generated by Copybara.
GitOrigin-RevId: 2146b10f0a498f665f246e16033b686c7947b92d
MediaPipe Team authored and chuoling committed May 10, 2021
1 parent a9b643e commit 017c1dc
Showing 52 changed files with 708 additions and 298 deletions.
3 changes: 3 additions & 0 deletions MANIFEST.in
@@ -14,3 +14,6 @@ exclude mediapipe/modules/objectron/object_detection_3d_sneakers.tflite
exclude mediapipe/modules/objectron/object_detection_3d_chair.tflite
exclude mediapipe/modules/objectron/object_detection_3d_camera.tflite
exclude mediapipe/modules/objectron/object_detection_3d_cup.tflite
exclude mediapipe/modules/objectron/object_detection_ssd_mobilenetv2_oidv4_fp16.tflite
exclude mediapipe/modules/pose_landmark/pose_landmark_lite.tflite
exclude mediapipe/modules/pose_landmark/pose_landmark_heavy.tflite
9 changes: 9 additions & 0 deletions docs/framework_concepts/framework_concepts.md
@@ -110,3 +110,12 @@ Other policies are also available, implemented using a separate kind of
component known as an InputStreamHandler.

See [Synchronization](synchronization.md) for more details.

### Realtime data streams

MediaPipe calculator graphs are often used to process streams of video or audio
frames for interactive applications. Normally, each Calculator runs as soon as
all of its input packets for a given timestamp become available. Calculators
used in realtime graphs need to define output timestamp bounds based on input
timestamp bounds in order to allow downstream calculators to be scheduled
promptly. See [Realtime data streams](realtime.md) for details.
187 changes: 187 additions & 0 deletions docs/framework_concepts/realtime.md
@@ -0,0 +1,187 @@
---
layout: default
title: Processing real-time data streams
nav_order: 6
has_children: true
has_toc: false
---

# Processing real-time data streams
{: .no_toc }

1. TOC
{:toc}
---

## Realtime timestamps

MediaPipe calculator graphs are often used to process streams of video or audio
frames for interactive applications. The MediaPipe framework requires only that
successive packets be assigned monotonically increasing timestamps. By
convention, realtime calculators and graphs use the recording time or the
presentation time of each frame as its timestamp, with each timestamp indicating
the microseconds since `Jan/1/1970:00:00:00`. This allows packets from various
sources to be processed in a globally consistent sequence.
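
As an illustration of this convention, the following standalone sketch converts a wall-clock time point into microseconds since the Unix epoch. It does not use MediaPipe; the helper name `MicrosSinceEpoch` is hypothetical, and it assumes that `std::chrono::system_clock` counts from 1970-01-01 00:00:00 UTC (guaranteed since C++20, and true in practice on earlier implementations).

```cpp
#include <chrono>
#include <cstdint>

// Converts a wall-clock time point to microseconds since the Unix epoch,
// the convention described above for realtime packet timestamps.
// Assumption: system_clock's epoch is 1970-01-01 00:00:00 UTC.
inline std::int64_t MicrosSinceEpoch(
    std::chrono::system_clock::time_point tp) {
  return std::chrono::duration_cast<std::chrono::microseconds>(
             tp.time_since_epoch())
      .count();
}
```

A capture loop could call `MicrosSinceEpoch(std::chrono::system_clock::now())` once per frame and use the result as that frame's timestamp value, provided successive frames receive strictly increasing values.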

## Realtime scheduling

Normally, each Calculator runs as soon as all of its input packets for a given
timestamp become available. Typically, this happens when the calculator has
finished processing the previous frame and each of the calculators producing
its inputs has finished processing the current frame. The MediaPipe scheduler
invokes each calculator as soon as these conditions are met. See
[Synchronization](synchronization.md) for more details.

## Timestamp bounds

When a calculator does not produce any output packets for a given timestamp, it
can instead output a "timestamp bound" indicating that no packet will be
produced for that timestamp. This indication is necessary to allow downstream
calculators to run at that timestamp, even though no packet has arrived for
certain streams for that timestamp. This is especially important for realtime
graphs in interactive applications, where it is crucial that each calculator
begin processing as soon as possible.

Consider a graph like the following:

```
node {
calculator: "A"
input_stream: "alpha_in"
output_stream: "alpha"
}
node {
calculator: "B"
input_stream: "alpha"
input_stream: "foo"
output_stream: "beta"
}
```

Suppose that, at timestamp `T`, node `A` doesn't send a packet in its output stream
`alpha`. Node `B` gets a packet in `foo` at timestamp `T` and is waiting for a
packet in `alpha` at timestamp `T`. If `A` doesn't send `B` a timestamp bound
update for `alpha`, `B` will keep waiting for a packet to arrive in `alpha`.
Meanwhile, the packet queue of `foo` will accumulate packets at `T`, `T+1` and
so on.
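
The stall described above can be reproduced with a small standalone model. This is a sketch, not MediaPipe code: `ToyInputStream` and `CanRunAt` are hypothetical names, timestamps are plain integers, and the readiness rule is simplified to "every input is settled at `T`".

```cpp
#include <cstdint>
#include <deque>

// Toy model of one input stream as seen by a downstream node.
struct ToyInputStream {
  std::deque<std::int64_t> queued;  // timestamps of packets waiting in the queue
  std::int64_t bound = 0;           // lowest timestamp a future packet may carry

  // A packet at timestamp t also advances the bound to t + 1.
  void AddPacket(std::int64_t t) {
    queued.push_back(t);
    bound = t + 1;
  }

  // A bound update promises that no packet below new_bound will arrive.
  void SetNextTimestampBound(std::int64_t new_bound) { bound = new_bound; }

  // The stream is settled at t once nothing more can arrive at or before t.
  bool SettledAt(std::int64_t t) const { return bound > t; }
};

// Node B may run at timestamp t only when both of its inputs are settled at t.
inline bool CanRunAt(const ToyInputStream& alpha, const ToyInputStream& foo,
                     std::int64_t t) {
  return alpha.SettledAt(t) && foo.SettledAt(t);
}
```

In this model, packets added to `foo` at `T` and `T+1` simply pile up in `queued` while `alpha.bound` stays at or below `T`. A single `alpha.SetNextTimestampBound(T + 1)` is enough to make `CanRunAt(alpha, foo, T)` true.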

To output a packet on a stream, a calculator uses the API functions
`CalculatorContext::Outputs` and `OutputStream::Add`. To instead output a
timestamp bound on a stream, a calculator can use the API functions
`CalculatorContext::Outputs` and `OutputStream::SetNextTimestampBound`. The
specified bound is the lowest allowable timestamp for the next packet on the
specified output stream. When no packet is output, a calculator will typically
do something like:

```
cc->Outputs().Tag("output_frame").SetNextTimestampBound(
cc->InputTimestamp().NextAllowedInStream());
```

The function `Timestamp::NextAllowedInStream` returns the next timestamp that
is allowed in a stream, which is the successor of the given timestamp. For
example, `Timestamp(1).NextAllowedInStream() == Timestamp(2)`.

## Propagating timestamp bounds

Calculators that will be used in realtime graphs need to define output timestamp
bounds based on input timestamp bounds in order to allow downstream calculators
to be scheduled promptly. A common pattern is for calculators to output packets
with the same timestamps as their input packets. In this case, simply outputting
a packet on every call to `Calculator::Process` is sufficient to define output
timestamp bounds.

However, calculators are not required to follow this common pattern for output
timestamps; they are only required to choose monotonically increasing output
timestamps. As a result, certain calculators must calculate timestamp bounds
explicitly. MediaPipe provides several tools for computing an appropriate
timestamp bound for each calculator.

1\. **SetNextTimestampBound()** can be used to specify the timestamp bound,
`t + 1`, for an output stream.

```
cc->Outputs().Tag("OUT").SetNextTimestampBound(t.NextAllowedInStream());
```

Alternatively, an empty packet with timestamp `t` can be produced to specify the
timestamp bound `t + 1`.

```
cc->Outputs().Tag("OUT").AddPacket(Packet().At(t));
```

The timestamp bound of an input stream is indicated by the packet or the empty
packet on the input stream.

```
Timestamp bound = cc->Inputs().Tag("IN").Value().Timestamp();
```

2\. **TimestampOffset()** can be specified in order to automatically copy the
timestamp bound from input streams to output streams.

```
cc->SetTimestampOffset(0);
```

This setting has the advantage of propagating timestamp bounds automatically,
even when only timestamp bounds arrive and `Calculator::Process` is not invoked.
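
The effect of a fixed offset can be pictured with the following standalone sketch. It is a toy model rather than MediaPipe code: `OutputBoundFromOffset` is a hypothetical helper, timestamps are plain integers, and it assumes that with several inputs the lowest input bound governs the output bound.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Toy model: with a fixed timestamp offset, the output bound is derived from
// the input bounds alone, so it can advance even when no packet arrives and
// Process() never runs.
// Assumption: with several inputs, the slowest (lowest) bound governs.
// Precondition: input_bounds is non-empty.
inline std::int64_t OutputBoundFromOffset(
    const std::vector<std::int64_t>& input_bounds, std::int64_t offset) {
  return *std::min_element(input_bounds.begin(), input_bounds.end()) + offset;
}
```

With an offset of 0, the output bound in this model simply mirrors the lowest input bound, which matches the intent of `cc->SetTimestampOffset(0)` described above.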

3\. **ProcessTimestampBounds()** can be specified in order to invoke
`Calculator::Process` for each new "settled timestamp", where the "settled
timestamp" is the new highest timestamp below the current timestamp bounds.
Without `ProcessTimestampBounds()`, `Calculator::Process` is invoked only when
one or more packets arrive.

```
cc->SetProcessTimestampBounds(true);
```

This setting allows a calculator to perform its own timestamp bounds calculation
and propagation, even when only input timestamp bounds are updated. It can be used to
replicate the effect of `TimestampOffset()`, but it can also be used to
calculate a timestamp bound that takes into account additional factors.

For example, in order to replicate `SetTimestampOffset(0)`, a calculator could
do the following:

```
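// Request Process() calls for timestamp-bound updates as well as for packets.
static absl::Status GetContract(CalculatorContract* cc) {
  cc->SetProcessTimestampBounds(true);
  return absl::OkStatus();
}
// Forward each settled input timestamp as the output stream's next bound.
absl::Status Process(CalculatorContext* cc) {
  cc->Outputs().Tag("OUT").SetNextTimestampBound(
      cc->InputTimestamp().NextAllowedInStream());
  return absl::OkStatus();
}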
```
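
The notion of a "settled timestamp" can be made concrete with a standalone sketch. This is a toy model, not MediaPipe code: `SettledTimestamp` and `ToyBoundsTracker` are hypothetical names, timestamps are plain integers, and the model assumes that a new `Process` call is warranted only when the settled timestamp actually advances.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Given the current bound of every input stream, returns the highest
// timestamp at which no further packets can arrive on any input.
// Precondition: bounds is non-empty.
inline std::int64_t SettledTimestamp(const std::vector<std::int64_t>& bounds) {
  return *std::min_element(bounds.begin(), bounds.end()) - 1;
}

// Tracks the last settled timestamp and reports when it moves forward.
struct ToyBoundsTracker {
  std::int64_t last_settled = -1;

  // Returns true when Process() should run for a new settled timestamp.
  bool Update(const std::vector<std::int64_t>& bounds) {
    const std::int64_t settled = SettledTimestamp(bounds);
    if (settled <= last_settled) return false;
    last_settled = settled;
    return true;
  }
};
```

In this model, raising the bound of one input has no effect while another input still holds the minimum; `Update` returns true only once the slowest input advances.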

## Scheduling of Calculator::Open and Calculator::Close

`Calculator::Open` is invoked when all required input side-packets have been
produced. Input side-packets can be provided by the enclosing application or by
"side-packet calculators" inside the graph. Side-packets can be specified from
outside the graph using the API's `CalculatorGraph::Initialize` and
`CalculatorGraph::StartRun`. Side-packets can be specified by calculators within
the graph using `CalculatorContext::OutputSidePackets` and
`OutputSidePacket::Set`.

`Calculator::Close` is invoked when all of the input streams have become `Done` by
being closed or reaching timestamp bound `Timestamp::Done`.

**Note:** If the graph finishes all pending calculator execution and becomes
`Done` before some streams become `Done`, then MediaPipe will invoke the
remaining calls to `Calculator::Close`, so that every calculator can produce its
final outputs.

The use of `TimestampOffset` has some implications for `Calculator::Close`. A
calculator specifying `SetTimestampOffset(0)` will by design signal that all of
its output streams have reached `Timestamp::Done` when all of its input streams
have reached `Timestamp::Done`, and therefore no further outputs are possible.
This prevents such a calculator from emitting any packets during
`Calculator::Close`. If a calculator needs to produce a summary packet during
`Calculator::Close`, `Calculator::Process` must specify timestamp bounds such
that at least one timestamp (such as `Timestamp::Max`) remains available during
`Calculator::Close`. This means that such a calculator normally cannot rely upon
`SetTimestampOffset(0)` and must instead specify timestamp bounds explicitly
using `SetNextTimestampBound()`.
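
The constraint on `Calculator::Close` can be illustrated with a standalone sketch. It is a toy model, not MediaPipe code: `ToyOutputStream`, `kMax`, and `kDone` are hypothetical stand-ins for an output stream, `Timestamp::Max`, and `Timestamp::Done`, and their numeric values are placeholders chosen only so that `kMax < kDone`.

```cpp
#include <cstdint>
#include <limits>

// Placeholder values standing in for Timestamp::Max and Timestamp::Done.
constexpr std::int64_t kDone = std::numeric_limits<std::int64_t>::max();
constexpr std::int64_t kMax = kDone - 1;

// Toy output stream that only tracks its next allowed timestamp.
struct ToyOutputStream {
  std::int64_t bound = 0;

  // A packet is accepted only at or above the current bound; accepting it
  // moves the bound just past the packet's timestamp.
  bool Add(std::int64_t t) {
    if (t < bound) return false;
    bound = t + 1;
    return true;
  }
};
```

A stream whose bound has already been pushed to `kDone`, as happens by design with an offset of 0, rejects a summary packet at `kMax`. A stream whose bound was set explicitly to a lower value still accepts it.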
Binary file added docs/images/mobile/pose_tracking_pck_chart.png
38 changes: 27 additions & 11 deletions docs/solutions/pose.md
@@ -79,19 +79,32 @@ to visualize its associated subgraphs, please see
## Pose Estimation Quality

To evaluate the quality of our [models](./models.md#pose) against other
well-performing publicly available solutions, we use a validation dataset,
consisting of 1k images with diverse Yoga, HIIT, and Dance postures. Each image
well-performing publicly available solutions, we use three different validation
datasets, representing different verticals: Yoga, Dance and HIIT. Each image
contains only a single person located 2-4 meters from the camera. To be
consistent with other solutions, we perform evaluation only for 17 keypoints
from [COCO topology](https://cocodataset.org/#keypoints-2020).

Method | [mAP](https://cocodataset.org/#keypoints-eval) | [PCK@0.2](https://github.com/cbsudux/Human-Pose-Estimation-101) | [FPS](https://en.wikipedia.org/wiki/Frame_rate), Pixel 3 [TFLite GPU](https://www.tensorflow.org/lite/performance/gpu_advanced) | [FPS](https://en.wikipedia.org/wiki/Frame_rate), MacBook Pro (15-inch, 2017)
----------------------------------------------------------------------------------------------------- | ---------------------------------------------: | --------------------------------------------------------------: | ------------------------------------------------------------------------------------------------------------------------------: | ---------------------------------------------------------------------------:
BlazePose.Lite | 49.1 | 91.7 | 49 | 40
BlazePose.Full | 64.5 | 95.8 | 40 | 37
BlazePose.Heavy | 70.9 | 97.0 | 19 | 26
[AlphaPose.ResNet50](https://github.com/MVIG-SJTU/AlphaPose) | 57.6 | 93.1 | N/A | N/A
[Apple Vision](https://developer.apple.com/documentation/vision/detecting_human_body_poses_in_images) | 37.0 | 85.3 | N/A | N/A
Method | Yoga <br/> [`mAP`] | Yoga <br/> [`PCK@0.2`] | Dance <br/> [`mAP`] | Dance <br/> [`PCK@0.2`] | HIIT <br/> [`mAP`] | HIIT <br/> [`PCK@0.2`]
----------------------------------------------------------------------------------------------------- | -----------------: | ---------------------: | ------------------: | ----------------------: | -----------------: | ---------------------:
BlazePose.Heavy | 68.1 | **96.4** | 73.0 | **97.2** | 74.0 | **97.5**
BlazePose.Full | 62.6 | **95.5** | 67.4 | **96.3** | 68.0 | **95.7**
BlazePose.Lite | 45.0 | **90.2** | 53.6 | **92.5** | 53.8 | **93.5**
[AlphaPose.ResNet50](https://github.com/MVIG-SJTU/AlphaPose) | 63.4 | **96.0** | 57.8 | **95.5** | 63.4 | **96.0**
[Apple.Vision](https://developer.apple.com/documentation/vision/detecting_human_body_poses_in_images) | 32.8 | **82.7** | 36.4 | **91.4** | 44.5 | **88.6**

![pose_tracking_pck_chart.png](../images/mobile/pose_tracking_pck_chart.png) |
:--------------------------------------------------------------------------: |
*Fig 2. Quality evaluation in [`PCK@0.2`].* |

We designed our models specifically for live perception use cases, so all of
them work in real-time on the majority of modern devices.

Method | Latency <br/> Pixel 3 [TFLite GPU](https://www.tensorflow.org/lite/performance/gpu_advanced) | Latency <br/> MacBook Pro (15-inch 2017)
--------------- | -------------------------------------------------------------------------------------------: | ---------------------------------------:
BlazePose.Heavy | 53 ms | 38 ms
BlazePose.Full | 25 ms | 27 ms
BlazePose.Lite | 20 ms | 25 ms

## Models

@@ -109,7 +122,7 @@ hip midpoints.

![pose_tracking_detector_vitruvian_man.png](../images/mobile/pose_tracking_detector_vitruvian_man.png) |
:----------------------------------------------------------------------------------------------------: |
*Fig 2. Vitruvian man aligned via two virtual keypoints predicted by BlazePose detector in addition to the face bounding box.* |
*Fig 3. Vitruvian man aligned via two virtual keypoints predicted by BlazePose detector in addition to the face bounding box.* |

### Pose Landmark Model (BlazePose GHUM 3D)

@@ -124,7 +137,7 @@ this [paper](https://arxiv.org/abs/2006.10204) and

![pose_tracking_full_body_landmarks.png](../images/mobile/pose_tracking_full_body_landmarks.png) |
:----------------------------------------------------------------------------------------------: |
*Fig 3. 33 pose landmarks.* |
*Fig 4. 33 pose landmarks.* |

## Solution APIs

@@ -384,3 +397,6 @@ on how to build MediaPipe examples.
* [Models and model cards](./models.md#pose)
* [Web demo](https://code.mediapipe.dev/codepen/pose)
* [Python Colab](https://mediapipe.page.link/pose_py_colab)

[`mAP`]: https://cocodataset.org/#keypoints-eval
[`PCK@0.2`]: https://github.com/cbsudux/Human-Pose-Estimation-101
16 changes: 16 additions & 0 deletions mediapipe/calculators/core/BUILD
@@ -233,6 +233,22 @@ cc_test(
],
)

cc_library(
name = "concatenate_vector_calculator_hdr",
hdrs = ["concatenate_vector_calculator.h"],
visibility = ["//visibility:public"],
deps = [
":concatenate_vector_calculator_cc_proto",
"//mediapipe/framework:calculator_framework",
"//mediapipe/framework/api2:node",
"//mediapipe/framework/api2:port",
"//mediapipe/framework/port:integral_types",
"//mediapipe/framework/port:ret_check",
"//mediapipe/framework/port:status",
],
alwayslink = 1,
)

cc_library(
name = "concatenate_vector_calculator",
srcs = ["concatenate_vector_calculator.cc"],
3 changes: 2 additions & 1 deletion mediapipe/calculators/core/default_side_packet_calculator.cc
@@ -71,7 +71,8 @@ absl::Status DefaultSidePacketCalculator::GetContract(CalculatorContract* cc) {
if (cc->InputSidePackets().HasTag(kOptionalValueTag)) {
cc->InputSidePackets()
.Tag(kOptionalValueTag)
.SetSameAs(&cc->InputSidePackets().Tag(kDefaultValueTag));
.SetSameAs(&cc->InputSidePackets().Tag(kDefaultValueTag))
.Optional();
}

RET_CHECK(cc->OutputSidePackets().HasTag(kValueTag));
2 changes: 2 additions & 0 deletions mediapipe/calculators/image/BUILD
@@ -410,7 +410,9 @@ cc_library(
srcs = ["image_properties_calculator.cc"],
visibility = ["//visibility:public"],
deps = [
"//mediapipe/framework/api2:node",
"//mediapipe/framework:calculator_framework",
"//mediapipe/framework/formats:image",
"//mediapipe/framework/formats:image_frame",
"//mediapipe/framework/port:ret_check",
"//mediapipe/framework/port:status",