47 changes: 47 additions & 0 deletions .gitignore
@@ -1,15 +1,18 @@
# OSX
#
.DS_Store
**/.DS_Store

**/.xcode.env.local

# XDE
.expo/
.expo-shared/

# VSCode
.vscode/
jsconfig.json
*.code-workspace

# Xcode
#
@@ -30,6 +33,9 @@ DerivedData
*.ipa
*.xcuserstate
project.xcworkspace
iOS/Pods
*.dSYM.zip
*.dSYM

# Android/IJ
#
@@ -41,18 +47,29 @@ project.xcworkspace
.settings
local.properties
android.iml
**/*.iml
android/app/release
*.jks
*.keystore
!debug.keystore

# Cocoapods
#
example/ios/Pods

# Ruby
example/vendor/
.ruby-version

# node.js
#
node_modules/
npm-debug.log
yarn-debug.log*
yarn-error.log*
npm-error.log*
lerna-debug.log*
.yarn

# Bun
package-lock.json
@@ -66,6 +83,8 @@ android/keystores/debug.keystore

# Expo
.expo/
.expo-shared/
__generated__/

# Turborepo
.turbo/
@@ -75,3 +94,31 @@ lib/

# TypeScript
tsconfig.tsbuildinfo

# Jest
.jest/
coverage/

# ESLint
.eslintcache

# Metro
.metro-health-check*

# Environment files
.env
.env.*
!.env.example

# Misc
*.orig
*.swp
*.swo
*.log
.tmp/
tmp/
cache/
.cache/

# react-native-config codegen
ios/tmp.xcconfig
110 changes: 110 additions & 0 deletions example/README.md
@@ -21,6 +21,7 @@
* Flash for video capture
* Activating/Pausing the Camera but keeping it "warm"
* Using the Example Frame Processor Plugin
* Pose Detection with TensorFlow Lite and MoveNet
</p>
</div>

@@ -46,3 +47,112 @@ bun bootstrap
1. Open the `example/android/` folder with Android Studio
2. Select your device in the devices drop-down
3. Hit run

## Pose Detection Implementation

The example app includes a human pose detection feature that demonstrates how to use Frame Processors with machine learning models. It uses TensorFlow Lite with the MoveNet model to detect human poses in real time.

### Architecture Overview

The pose detection implementation follows a plugin architecture with two main components:

1. **Native Swift Plugin**: Handles frame processing, model inference, and coordinate mapping
2. **React Native Component**: Renders the detected pose skeleton overlay

### Swift Implementation (iOS)

The Swift implementation is structured as a Frame Processor Plugin that processes each camera frame:

#### Plugin Structure

- **PoseDetectionFrameProcessor.swift**: Main implementation of the frame processor
- **PoseDetectionFrameProcessor.m**: Objective-C bridge for registering the plugin

#### Key Components

1. **Model Loading**: Loads the MoveNet TensorFlow Lite model (Thunder variant)
2. **Frame Preprocessing**:
- Rotates and crops the input frame to match model requirements
- Resizes the image to 256x256 pixels (model input size)
- Converts RGBA to RGB format
3. **Model Inference**:
- Runs the TensorFlow Lite model on the preprocessed frame
- Extracts keypoint coordinates and confidence scores
4. **Coordinate Transformation**:
- Transforms normalized model coordinates (0-1) to original frame coordinates
- Handles rotation, scaling, and cropping transformations
- Accounts for device orientation and camera mirroring
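
On the JavaScript side, the value the plugin returns for each frame can be modeled roughly as below. The field names are an assumption for illustration, not a documented API; per the steps above, `x`/`y` are pixel coordinates in the original camera frame and each keypoint carries a confidence score:

```typescript
// Assumed result shape for pose_detection_plugin (illustrative names).
// x/y are pixels in the original camera frame, after the native side has
// undone the rotation/cropping it applied for the model; score is the
// model's per-keypoint confidence in [0, 1].
interface Keypoint {
  name: string; // e.g. 'left_wrist'
  x: number;
  y: number;
  score: number;
}

type PoseDetectionResult = Keypoint[];
```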

#### Keypoints and Connections

The model detects 17 keypoints representing body parts (nose, eyes, ears, shoulders, elbows, wrists, hips, knees, ankles) and defines connections between them to form a skeleton.
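
MoveNet emits these 17 keypoints in the standard COCO order. The edge list below is one common way to wire them into a skeleton; the example app's exact edge list may differ slightly:

```typescript
// The 17 MoveNet keypoints in their standard (COCO) output order.
export const KEYPOINT_NAMES = [
  'nose', 'left_eye', 'right_eye', 'left_ear', 'right_ear',
  'left_shoulder', 'right_shoulder', 'left_elbow', 'right_elbow',
  'left_wrist', 'right_wrist', 'left_hip', 'right_hip',
  'left_knee', 'right_knee', 'left_ankle', 'right_ankle',
] as const;

// One common skeleton wiring (torso, arms, legs).
export const SKELETON_EDGES: Array<[string, string]> = [
  ['left_shoulder', 'right_shoulder'], ['left_hip', 'right_hip'],
  ['left_shoulder', 'left_elbow'], ['left_elbow', 'left_wrist'],
  ['right_shoulder', 'right_elbow'], ['right_elbow', 'right_wrist'],
  ['left_shoulder', 'left_hip'], ['right_shoulder', 'right_hip'],
  ['left_hip', 'left_knee'], ['left_knee', 'left_ankle'],
  ['right_hip', 'right_knee'], ['right_knee', 'right_ankle'],
];
```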

### TypeScript Implementation (React Native)

The React Native side renders the detected pose skeleton on top of the camera preview:

#### Component Structure

- **PoseSkeletonOverlay.tsx**: React component that renders SVG elements for the skeleton
- **PoseDetectionPlugin.ts**: TypeScript interface for the native plugin

#### Key Features

1. **Skeleton Rendering**:
- Renders lines between connected keypoints
- Renders circles at keypoint positions
- Filters keypoints based on confidence threshold
2. **Coordinate Mapping**:
- Maps coordinates from camera space to view space
- Handles mirroring for front camera
3. **Customization**:
- Configurable colors, line widths, and point sizes
- Adjustable confidence threshold
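
A minimal sketch of how such an overlay can be written with `react-native-svg`. Only `poseData`, `mirrored`, and `confidenceThreshold` appear in the usage example further down; the remaining prop names are illustrative, the `Keypoint` shape is repeated from the earlier sketch for self-containment, and camera-to-view mapping is assumed to have happened already (see the next section):

```tsx
import React from 'react';
import Svg, { Circle, Line } from 'react-native-svg';

interface Keypoint { name: string; x: number; y: number; score: number }

interface Props {
  poseData: Keypoint[];                 // keypoints in view-space coordinates
  connections: Array<[string, string]>; // e.g. SKELETON_EDGES from above
  width: number;                        // overlay size in view points
  height: number;
  mirrored?: boolean;                   // flip horizontally (front camera)
  confidenceThreshold?: number;
  color?: string;
  lineWidth?: number;
  pointRadius?: number;
}

export function PoseSkeletonOverlay({
  poseData, connections, width, height, mirrored = false,
  confidenceThreshold = 0.3, color = 'lime', lineWidth = 2, pointRadius = 4,
}: Props) {
  // Drop low-confidence keypoints, mirroring the survivors if requested.
  const visible = new Map<string, Keypoint>(
    poseData
      .filter((kp) => kp.score >= confidenceThreshold)
      .map((kp): [string, Keypoint] => [
        kp.name,
        mirrored ? { ...kp, x: width - kp.x } : kp,
      ]),
  );
  return (
    <Svg style={{ position: 'absolute' }} width={width} height={height}>
      {/* Bones: draw an edge only when both endpoints passed the filter. */}
      {connections.map(([a, b]) => {
        const p = visible.get(a);
        const q = visible.get(b);
        if (!p || !q) return null;
        return (
          <Line key={`${a}-${b}`} x1={p.x} y1={p.y} x2={q.x} y2={q.y}
                stroke={color} strokeWidth={lineWidth} />
        );
      })}
      {/* Joints. */}
      {[...visible.values()].map((kp) => (
        <Circle key={kp.name} cx={kp.x} cy={kp.y} r={pointRadius} fill={color} />
      ))}
    </Svg>
  );
}
```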

### Coordinate Transformation Challenges

One of the main challenges in the implementation is correctly mapping coordinates between different spaces:

1. **Model Space**: Normalized coordinates (0-1) from the ML model
2. **Camera Space**: Original camera frame coordinates
3. **View Space**: Coordinates in the React Native view

Additional complexities include:

- **Rotation**: Camera frames may be rotated 90° from the display orientation
- **Mirroring**: Front camera requires horizontal flipping
- **Aspect Ratio**: Handling different aspect ratios between camera and display
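
To make the pipeline concrete, here is a hedged sketch of carrying one keypoint through the three spaces, including the 90° rotation and front-camera mirroring called out above. All names are illustrative, a portrait UI fed by landscape frames is assumed, and the example app does this math natively in Swift:

```typescript
interface Size { width: number; height: number }

function mapKeypointToView(
  kp: { x: number; y: number }, // model space: normalized 0-1
  frame: Size,                  // camera space: frame size in pixels
  view: Size,                   // view space: overlay size in points
  mirrored: boolean,            // true for the front camera
): { x: number; y: number } {
  // 1. Model space -> camera space: scale normalized coords to pixels.
  const px = kp.x * frame.width;
  const py = kp.y * frame.height;

  // 2. Rotate 90° clockwise: (x, y) -> (frameHeight - y, x).
  const rx = frame.height - py;
  const ry = px;
  const rotated = { width: frame.height, height: frame.width };

  // 3. Camera space -> view space. A plain stretch is shown here; a
  //    "cover"-style preview additionally needs an aspect-ratio crop offset.
  let vx = (rx / rotated.width) * view.width;
  const vy = (ry / rotated.height) * view.height;

  // 4. The front camera preview is mirrored, so flip horizontally.
  if (mirrored) vx = view.width - vx;

  return { x: vx, y: vy };
}
```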

### Usage

Pose detection is enabled by attaching the frame processor to the `Camera` view and rendering the skeleton overlay as a child:

```jsx
<Camera
  style={styles.camera}
  device={device}
  isActive={isActive}
  frameProcessor={frameProcessor}
  frameProcessorFps={5}
>
  <PoseSkeletonOverlay
    poseData={poseData}
    mirrored={isFrontCamera}
    confidenceThreshold={0.3}
  />
</Camera>
```

The frame processor can be configured with:

```javascript
const frameProcessor = useFrameProcessor((frame) => {
  'worklet';
  const poses = pose_detection_plugin(frame, {
    modelType: 'thunder',
    minConfidence: 0.3,
    drawSkeleton: true,
  });
  // Hand the result back to the React (JS) thread to update state.
  runOnJS(setPoseData)(poses);
}, []);
```
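
Because the frame processor body is a worklet running off the JS thread, the detected poses are handed back with `runOnJS` so that `setPoseData` can update React state and re-render the overlay.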
15 changes: 15 additions & 0 deletions PoseDetectionFrameProcessor.m
@@ -0,0 +1,15 @@
//
// PoseDetectionFrameProcessor.m
// VisionCameraExample
//

#if __has_include(<VisionCamera/FrameProcessorPlugin.h>)
#import <VisionCamera/FrameProcessorPlugin.h>
#import <VisionCamera/FrameProcessorPluginRegistry.h>

#import "VisionCameraExample-Swift.h"

// Swift Frame Processor plugin registration
VISION_EXPORT_SWIFT_FRAME_PROCESSOR(PoseDetectionFrameProcessorPlugin, pose_detection_plugin)

#endif