This is the official PyTorch implementation of the paper "Generating Human Interaction Motions in Scenes with Text Control" (ECCV 2024). For more details, please visit our project webpage.
If you find the code useful, please cite our work:
@inproceedings{yi2024tesmo,
  author    = {Yi, Hongwei and Thies, Justus and Black, Michael J. and Peng, Xue Bin and Rempe, Davis},
  title     = {Generating Human Interaction Motions in Scenes with Text Control},
  booktitle = {ECCV},
  year      = {2024}
}
This code was tested on Ubuntu 20.04.5 LTS and requires:
- Python 3.8
- Anaconda3 or Miniconda3
- CUDA-capable GPU (one is enough)
Install ffmpeg (if not already installed):
sudo apt update
sudo apt install ffmpeg
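You can verify the install with:
ffmpeg -version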
Set up the conda environment:
conda env create -f environment.yml
conda activate tesmo
python -m spacy download en_core_web_sm
pip install git+https://github.com/openai/CLIP.git
pip install git+https://github.com/GuyTevet/smplx.git
pip install wandb
pip install "git+https://github.com/facebookresearch/pytorch3d.git"
pip install human-body-prior # note: the PyPI package leads to a bm_name error, so it is also installed from source below
pip install moviepy
pip install hydra-core # the Hydra configuration framework (the PyPI package named "hydra" is unrelated)
git clone https://github.com/nghorbani/human_body_prior.git
cd human_body_prior
python setup.py install
cd ..
pip install python-fcl
# install mmcv
pip install mmcv==2.0.0rc4 -f https://download.openmmlab.com/mmcv/dist/cu117/torch1.13/index.html
# visualization
pip install torchshow
pip install matplotlib==3.1.3
# pip install mesh-to-sdf
pip install git+https://github.com/marian42/mesh_to_sdf.git
pip install open3d
pip install plyfile
pip install numpy==1.23.4
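As a quick sanity check of the environment (a minimal sketch; the import names assume the standard packages installed above, and the exact torch/CUDA versions come from environment.yml):
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"  # should print True on a CUDA machine
python -c "import clip, smplx, mmcv, fcl, open3d, human_body_prior; print('imports ok')"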
PriorMDM shares most of its dependencies with the original MDM. Please follow the official instructions to install the dependencies.
Download the pretrained models from interaction model and locomotion model, then unzip and place them in pretrained_model/.
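If it helps, a minimal placement sketch (the archive names below are assumptions; substitute whatever the downloaded files are actually called):
mkdir -p pretrained_model
unzip interaction_model.zip -d pretrained_model/   # hypothetical archive name
unzip locomotion_model.zip -d pretrained_model/    # hypothetical archive name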
- Follow the instructions in data_generation to get the bird-view floor plan and object mask from 3D-FRONT. Then, align human motions with 3D scenes by running align_motion_amass. All the fitting results will be saved to the dataset/3dfront_fitting folder.
- Finally, run this command to get canonical motion-scene pairs for training and inference:
bash scripts/preprocess_locomotion.sh
The results will be saved to dataset/3dfront_fitting.
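A quick way to confirm the preprocessing finished (the directory name comes from the steps above; the file names inside depend on the preprocessing scripts):
ls dataset/3dfront_fitting | head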
- Follow the instructions in data_generation to fit objects to human motions. Then, place the fitting results under the dataset/SAMP_interaction/SAMP_objects folder.
- Download the 3D-FUTURE dataset from this webpage, then place the downloaded dataset under the dataset/3D-FUTURE-model folder.
- Download the original SAMP dataset from this webpage, then place it under the dataset/SAMP_interaction/body_meshes_smpl folder.
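Putting the data-preparation steps together, the expected dataset layout looks like this (the directory names are the ones used above; the file contents depend on the respective downloads and fitting scripts):
# dataset/
# ├── 3dfront_fitting/          # aligned, canonicalized motion-scene pairs
# ├── 3D-FUTURE-model/          # 3D-FUTURE object models
# └── SAMP_interaction/
#     ├── SAMP_objects/         # object fitting results
#     └── body_meshes_smpl/     # original SAMP body meshes
mkdir -p dataset/3D-FUTURE-model dataset/SAMP_interaction/SAMP_objects dataset/SAMP_interaction/body_meshes_smpl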
Before running human locomotion generation, please download the pretrained root_horizontal_control model from PriorMDM and put it under the pretrained_model/inpainting folder.
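For example (the checkpoint folder name is assumed to mirror the PriorMDM model name; adjust it to whatever the download actually contains):
mkdir -p pretrained_model/inpainting
# e.g. pretrained_model/inpainting/root_horizontal_control/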
To run locomotion inference on HumanML3D, you can run this command:
bash run_sh/run_infer_locomotion.sh
To render generated human motion in 3D scene, you can run this command:
bash run_sh/run_render_locomotion.sh [sample_id] # sample_id: 0, 1, 2 ...
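For example, to render the first few samples in one go:
for i in 0 1 2; do bash run_sh/run_render_locomotion.sh $i; done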
Finally, you can get the visualization result like this:
To evaluate a pretrained interaction model on SAMP, you can run this command to start inference:
bash run_sh/run_infer_interaction.sh
- Use --joint2mesh to run SMPLify.
- Use --render_pyrender to visualize human-object interaction.
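A possible invocation with both options (whether run_sh/run_infer_interaction.sh forwards extra arguments to the underlying Python entry point is an assumption; if it does not, add the flags inside the script instead):
bash run_sh/run_infer_interaction.sh --joint2mesh --render_pyrender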
Finally, you can get the visualization results like this:
This codebase and pre-trained models are provided under this non-commercial license. As detailed below, parts of the code build on prior work with their own licenses.
Our implementation takes advantage of several great prior works that have their own licenses that must be followed:
- Some code is from PriorMDM
- We took reference from OmniControl
- We borrow some data generation code from SUMMON
- Data processing code from HUMANISE
We thank these authors for their open-source contributions.
Our models are trained on data that makes use of the 3D-FRONT and 3D-FUTURE datasets.
Thanks to Yangyi Huang, Yifei Liu, and Xiaolin Hong for technical support. Thanks to Mathis Petrovich and Nikos Athanasiou for the fruitful discussions about text-to-motion synthesis. Thanks to T. Niewiadomski, T. McConnell, and T. Alexiadis for running the user study.