Important
The new version of BBox-Mask-Pose (BMPv2) is now available on arXiv.
BMPv2 significantly improves performance; see the quantitative results reported in the preprint.
One of the key contributions is PMPose, a new top-down pose estimation model that performs strongly both on standard benchmarks and in crowded scenes.
The code is integrated into the main branch and was published in Release 2.0.0.
Due to repository changes, version 2.0.0 is not backward compatible with previous versions.
- Mar 2026: HuggingFace Image Demo is up-to-date with BMPv2. Check out the 3D generation!
- Mar 2026: Version 2.0 with improved (1) pose and (2) SAM and (3) wiring to 3D prediction released.
- Feb 2026: SAM-pose2seg won a Best Paper Award at CVWW 2026 🎉
- Jan 2026: BMPv2 paper is available on arXiv
- Aug 2025: HuggingFace Image Demo is out! 🎮
- Jul 2025: Version 1.1 with easy-to-run image demo released
- Jun 2025: BMPv1 paper accepted to ICCV 2025! 🎉
- Dec 2024: BMPv1 code is available
- Nov 2024: The project website is online
Bounding boxes, masks, and poses capture complementary aspects of the human body. BBoxMaskPose links detection, segmentation, and pose estimation iteratively, where each prediction refines the others. PMPose combines probabilistic modeling with mask conditioning for robust pose estimation in crowds. Together, these components achieve state-of-the-art results on COCO and OCHuman, making BMP the first method to exceed 50 AP on OCHuman.
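The iterative loop can be sketched in a few lines of Python. Every name and signature below is an illustrative stand-in for exposition, not the repository's actual API:

```python
# Illustrative sketch of the BBoxMaskPose loop. All names here are
# stand-ins for exposition, not the repository's actual API.

def bmp_loop(image, detect, segment, estimate_pose, num_iters=2):
    """Iteratively refine detections, masks, and poses on one image."""
    boxes = detect(image, masked_out=[])  # initial detections
    masks, poses = [], []
    for _ in range(num_iters):
        masks = [segment(image, b) for b in boxes]        # mask per instance
        poses = [estimate_pose(image, m) for m in masks]  # mask-conditioned pose
        # Re-run detection with already-explained regions masked out, so the
        # detector can look for people hidden behind previous detections.
        boxes = boxes + detect(image, masked_out=masks)
    return boxes, masks, poses

# Toy stand-ins so the sketch runs end-to-end:
detect = lambda img, masked_out: [] if masked_out else [(0, 0, 10, 10)]
segment = lambda img, box: {"box": box}
estimate_pose = lambda img, mask: {"mask": mask}
boxes, masks, poses = bmp_loop("img.jpg", detect, segment, estimate_pose)
print(len(boxes), len(masks), len(poses))  # 1 1 1
```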
The repository is organized into two main packages with stable public APIs:
BBoxMaskPose/
├── pmpose/ # PMPose package (pose estimation)
│ └── pmpose/
│ ├── api.py # PUBLIC API: PMPose class
│ ├── mm_utils.py # Internal utilities
│ └── posevis_lite.py # Visualization
├── mmpose/ # MMPose fork with our edits
├── bboxmaskpose/ # BBoxMaskPose package (full pipeline)
│ └── bboxmaskpose/
│ ├── api.py # PUBLIC API: BBoxMaskPose class
│ ├── sam2/ # SAM2 implementation
│ ├── configs/ # BMP configurations
│ └── *_utils.py # Internal utilities
├── demos/ # Public API demos
│ ├── PMPose_demo.py # PMPose usage example
│ ├── BMP_demo.py # BBoxMaskPose usage example
│ └── quickstart.ipynb # Interactive notebook
└── demo/ # Legacy demo (still functional)
Key contributions:
- MaskPose: a pose estimation model conditioned on segmentation masks instead of bounding boxes, boosting performance in dense scenes without adding parameters
- Download pre-trained weights below
- PMPose: a pose estimation model that models the full keypoint probability distribution and is conditioned on segmentation masks instead of bounding boxes, boosting performance in dense scenes without adding parameters
- Download pre-trained weights below
- BBox-MaskPose (BMP): method linking bounding boxes, segmentation masks, and poses to simultaneously address multi-body detection, segmentation and pose estimation
- Try the demo!
- SAM-pose2seg: fine-tuned SAM2 model for pose-guided instance segmentation
- Try the demo!
- Fine-tuned RTMDet adapted for iterative detection (ignoring 'holes')
- Download pre-trained weights below
- Support for multi-dataset training of ViTPose, previously implemented in the official ViTPose repository but absent in MMPose.
For more details, please visit our project website.
If you want to try our models without any installation, you can try the free HuggingFace demos.
BBoxMaskPose Demo showcases the whole loop, including 3D pose estimation. You can generate GIFs similar to the one at the top of this README. Due to 3D rendering, this demo takes approximately 30-60 seconds per image.
PMPose Demo showcases our family of PMPose models. It is not an iterative method but a standard feed-forward top-down 2D pose estimation method. Check it out if you're interested in fast pose estimation.
The fastest way to get started with GPU support:
# Clone and build
git clone https://github.com/mirapurkrabek/BBoxMaskPose.git
cd BBoxMaskPose
docker-compose build
# Run the demo
docker-compose up
Requires: Docker Engine 19.03+, NVIDIA Container Toolkit, NVIDIA GPU with CUDA 12.1 support.
This project is built on top of MMPose and SAM 2.1. Please refer to the MMPose installation guide or SAM installation guide for detailed setup instructions.
Basic installation steps:
# Clone the repository
git clone https://github.com/mirapurkrabek/BBoxMaskPose.git BBoxMaskPose/
cd BBoxMaskPose
# Install your version of torch, torchvision, OpenCV and NumPy
pip install torch==2.1.2+cu121 torchvision==0.16.2+cu121 --extra-index-url https://download.pytorch.org/whl/cu121
pip install numpy==1.25.1 opencv-python==4.9.0.80
# Install MMLibrary
pip install -U openmim
mim install mmengine "mmcv==2.1.0" "mmdet==3.3.0" "mmpretrain==1.2.0"
# Install dependencies
pip install -r requirements.txt
pip install -e .
python demos/PMPose_demo.py --image data/004806.jpg --device cuda
python demos/BMP_demo.py --image data/004806.jpg --device cuda
After running the demos, outputs are in outputs/004806/. The expected output should look like this:
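Before running the demos, a quick import check can confirm that the pinned stack resolved. The package list mirrors the install commands above; the helper itself is our illustration, not part of the repository:

```python
# Post-install sanity check: report the version of each pinned package.
import importlib

def check_packages(packages):
    """Return {package: version-or-None} without raising on missing installs."""
    report = {}
    for pkg in packages:
        try:
            mod = importlib.import_module(pkg)
            report[pkg] = getattr(mod, "__version__", "unknown")
        except ImportError:
            report[pkg] = None
    return report

for pkg, version in check_packages(
    ["torch", "torchvision", "cv2", "numpy", "mmengine", "mmcv", "mmdet", "mmpretrain"]
).items():
    print(f"{pkg}: {version or 'NOT INSTALLED'}")
```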
This demo extends BMP with SAM-3D-Body for 3D human mesh recovery:
# Basic usage (auto-downloads checkpoint from HuggingFace)
python demos/BMPv2_demo.py --image data/004806.jpg --device cuda
# With local checkpoint
python demos/BMPv2_demo.py --image data/004806.jpg --device cuda \
--sam3d_checkpoint checkpoints/sam-3d-body-dinov3/model.ckpt \
  --mhr_path checkpoints/sam-3d-body-dinov3/assets/mhr_model.pt
SAM-3D-Body Installation (Optional): BMPv2 requires SAM-3D-Body for 3D mesh recovery. Install it separately:
# 1. Install dependencies
pip install -r requirements/sam3d.txt
# 2. Install detectron2
pip install 'git+https://github.com/facebookresearch/detectron2.git@a1ce2f9' --no-build-isolation --no-deps
# 3. Install MoGe (optional, for FOV estimation)
pip install git+https://github.com/microsoft/MoGe.git
# 4. Install adapted SAM-3D-Body repository
pip install git+https://github.com/MiraPurkrabek/sam-3d-body.git
# 5. Request access to checkpoints at https://huggingface.co/facebook/sam-3d-body-dinov3
For more details, see SAM-3D-Body installation guide.
Interactive demo with both PMPose and BBoxMaskPose:
jupyter notebook demos/quickstart.ipynb
PMPose API - Pose estimation with bounding boxes:
from pmpose import PMPose
# Initialize model
pose_model = PMPose(device="cuda", from_pretrained=True)
# Run inference
keypoints, presence, visibility, heatmaps = pose_model.predict(
image="demo/data/004806.jpg",
bboxes=[[100, 100, 300, 400]], # [x1, y1, x2, y2]
)
# Visualize
vis_img = pose_model.visualize(image="demo/data/004806.jpg", keypoints=keypoints)
BBoxMaskPose API - Full detection + pose + segmentation:
from pmpose import PMPose
from bboxmaskpose import BBoxMaskPose
# Create pose model
pose_model = PMPose(device="cuda", from_pretrained=True)
# Inject into BMP
bmp_model = BBoxMaskPose(config="BMP_D3", device="cuda", pose_model=pose_model)
result = bmp_model.predict(image="demo/data/004806.jpg")
# Visualize
vis_img = bmp_model.visualize(image="demo/data/004806.jpg", result=result)
Pre-trained models are available on VRG Hugging Face 🤗. To run the demo, you don't need to download any weights manually; the detector, SAM-pose2seg, and pose estimator are downloaded at runtime.
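Note that the API examples above pass boxes as [x1, y1, x2, y2] corners, while COCO annotations store [x, y, w, h]. A small conversion helper (ours, not part of the package) bridges the two:

```python
def xywh_to_xyxy(box):
    """Convert a COCO-style [x, y, w, h] box to the [x1, y1, x2, y2]
    corner format used in the API examples above."""
    x, y, w, h = box
    return [x, y, x + w, y + h]

print(xywh_to_xyxy([100, 100, 200, 300]))  # [100, 100, 300, 400]
```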
If you want to download the weights yourself, here are the links to our HuggingFace repositories:
- ViTPose-b trained on COCO+MPII+AIC -- download weights
- MaskPose-b -- download weights
- PMPose -- select model
- SAM-pose2seg -- download weights
- Fine-tuned RTMDet-L -- download weights
The code combines MMDetection, MMPose 2.0, ViTPose, SAM 2.1 and SAM-3D-Body.
Our visualizations integrate Distinctipy for automatic color selection.
This repository combines our work on the BBoxMaskPose project with our previous work on probabilistic modelling for 2D human pose estimation.
The code was implemented by Miroslav Purkrábek and Constantin Kolomiiets. If you use this work, kindly cite it using the references provided below.
For questions, please use Issues or Discussions.
@InProceedings{Purkrabek2025BMPv1,
author = {Purkrabek, Miroslav and Matas, Jiri},
title = {Detection, Pose Estimation and Segmentation for Multiple Bodies: Closing the Virtuous Circle},
booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
month = {October},
year = {2025},
pages = {9004-9013}
}
@InProceedings{Purkrabek2026BMPv2,
author = {Purkrabek, Miroslav and Kolomiiets, Constantin and Matas, Jiri},
title = {BBoxMaskPose v2: Expanding Mutual Conditioning to 3D},
booktitle = {arXiv preprint arXiv:2601.15200},
year = {2026}
}
@article{yang2025sam3dbody,
title={SAM 3D Body: Robust Full-Body Human Mesh Recovery},
author={Yang, Xitong and Kukreja, Devansh and Pinkus, Don and Sagar, Anushka and Fan, Taosha and Park, Jinhyung and Shin, Soyong and Cao, Jinkun and Liu, Jiawei and Ugrinovic, Nicolas and Feiszli, Matt and Malik, Jitendra and Dollar, Piotr and Kitani, Kris},
journal={arXiv preprint; identifier to be added},
year={2025}
}
@InProceedings{Kolomiiets2026CVWW,
author = {Kolomiiets, Constantin and Purkrabek, Miroslav and Matas, Jiri},
title = {SAM-pose2seg: Pose-Guided Human Instance Segmentation in Crowds},
booktitle = {Computer Vision Winter Workshop (CVWW)},
year = {2026}
}


