[CVPR 2024] Official implementation of the paper "Towards Versatile Human-Human Interaction Analysis"

Inter-X: Towards Versatile Human-Human Interaction Analysis

This repository contains the content of the following paper:

Inter-X: Towards Versatile Human-Human Interaction Analysis
Liang Xu1,2, Xintao Lv1, Yichao Yan1, Xin Jin2, Shuwen Wu1, Congsheng Xu1, Yifan Liu1, Yizhou Zhou3, Fengyun Rao3, Xingdong Sheng4, Yunhui Liu4, Wenjun Zeng2, Xiaokang Yang1
1 Shanghai Jiao Tong University 2 Eastern Institute of Technology, Ningbo 3 WeChat, Tencent Inc. 4 Lenovo

News

  • [2024.08.10] We release the training and evaluation code for text2motion, along with the checkpoints.
  • [2024.08.07] We release the data preprocessing code and the processed data.
  • [2024.06.02] We release the code of fitting SMPL-X parameters from the MoCap data here.
  • [2024.04.16] We release the Inter-X dataset, with SMPL-X parameters, skeleton parameters, and annotations of textual descriptions, action settings, interaction order, relationships, and personalities.
  • [2024.02.27] Inter-X is accepted by CVPR 2024!
  • [2023.12.27] We release the paper and project page of Inter-X.

TODO

  • Release the action-to-motion code and checkpoints.
  • Release the text-to-motion code and checkpoints.
  • Release the data pre-processing code.
  • Release the scripts to visualize the dataset.
  • Release the whole dataset and annotations.

Please stay tuned for updates to the dataset and code!

Dataset Comparison

Dataset Download

Please fill out this form to request authorization to download Inter-X for research purposes.

We also provide the 40 action categories and the train/val/test splits in the datasets folder.

Data visualization

1. Visualize the SMPL-X parameters

We recommend using AIT-Viewer to visualize the dataset.

pip install aitviewer
pip install -r visualize/smplx_viewer_tool/requirements.txt

Installation

You need to download the SMPL-X models and then place them under visualize/smplx_viewer_tool/body_models.

├── SMPLX_FEMALE.npz
├── SMPLX_FEMALE.pkl
├── SMPLX_MALE.npz
├── SMPLX_MALE.pkl
├── SMPLX_NEUTRAL.npz
├── SMPLX_NEUTRAL.pkl
└── SMPLX_NEUTRAL_2020.npz

Usage

cd visualize/smplx_viewer_tool
# 1. Make sure the SMPL-X body models are downloaded
# 2. Create a soft link of the SMPL-X data to the smplx_viewer_tool folder
ln -s Your_Path_Of_SMPL-X ./data
# 3. Create a soft link of the text annotations to the smplx_viewer_tool folder
ln -s Your_Path_Of_Texts ./texts
python data_viewer.py

2. Visualize the skeleton parameters

Usage

cd visualize/joint_viewer_tool
# 1. Create a soft link of the skeleton data to the joint_viewer_tool folder
ln -s Your_Path_Of_Joints ./data
# 2. Create a soft link of the text annotations to the joint_viewer_tool folder
ln -s Your_Path_Of_Texts ./texts
python data_viewer.py

Data Loading

Each file/folder name of Inter-X is in the format of GgggTtttAaaaRrrr (e.g., G001T000A000R000), in which ggg is the human-human group number, ttt is the shoot number, aaa is the action label, and rrr is the split number.

The human-human group number is aligned with the big_five and familiarity annotations. Group numbers range from 001 to 059, and action labels range from 000 to 039.
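The naming scheme above can be unpacked with a small helper. This is only a minimal sketch: the function name `parse_sequence_name` and the returned dictionary keys are hypothetical and are not part of the Inter-X code base.

```python
import re

# Pattern for names such as "G001T000A000R000" (format taken from the text above).
NAME_PATTERN = re.compile(r"^G(\d{3})T(\d{3})A(\d{3})R(\d{3})$")


def parse_sequence_name(name):
    """Split a sequence name into its four integer fields (hypothetical helper)."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"Unexpected sequence name: {name!r}")
    group, shoot, action, split = (int(x) for x in match.groups())
    return {"group": group, "shoot": shoot, "action": action, "split": split}


print(parse_sequence_name("G001T000A000R000"))
```

The `action` field can then be used to look up the category in annots/action_setting.txt, and the `group` field to look up the big_five and familiarity annotations.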

The directory structure of the downloaded dataset is:

Inter-X_Dataset
├── LICENSE.md
├── annots
│   ├── action_setting.txt # 40 action categories
│   ├── big_five.npy # big-five personalities
│   ├── familiarity.txt # familiarity level, from 1-4, larger means more familiar
│   └── interaction_order.pkl # actor-reactor order, 0 means P1 is actor; 1 means P2 is actor
├── splits # train/val/test splits
│   ├── all.txt
│   ├── test.txt
│   ├── train.txt
│   └── val.txt
├── motions.zip # SMPL-X parameters at 120 fps
├── skeletons.zip # skeleton parameters at 120 fps
└── texts.zip # textual descriptions
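A split file from the layout above could be read roughly as follows. This sketch assumes each line of a split file holds one sequence name such as G001T000A000R000; that format is an assumption and should be checked against the downloaded files. The helper name `read_split` is hypothetical.

```python
from pathlib import Path


def read_split(split_file):
    """Return the non-empty, stripped lines of a split file as a list of names.

    Assumes one sequence name per line (not confirmed by the dataset docs).
    """
    text = Path(split_file).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


# Example usage once the dataset is downloaded:
# train_names = read_split("Inter-X_Dataset/splits/train.txt")
```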
  • To load the SMPL-X motion parameters you can simply do:
import numpy as np

# load the motion data
motion = np.load('motions/G001T000A000R000/P1.npz')
motion_parms = {
            'root_orient': motion['root_orient'],  # controls the global root orientation
            'pose_body': motion['pose_body'],  # controls the body
            'pose_lhand': motion['pose_lhand'],  # controls the left hand articulation
            'pose_rhand': motion['pose_rhand'],  # controls the right hand articulation
            'trans': motion['trans'],  # controls the global body position
            'betas': motion['betas'],  # controls the body shape
            'gender': motion['gender'],  # controls the gender
        }
  • To load the skeleton data you can simply do:
# The skeleton topology is defined by OPTITRACK_LIMBS and SELECTED_JOINTS in joint_viewer_tool/data_viewer.py
import numpy as np
skeleton = np.load('skeletons/G001T000A000R000/P1.npy') # skeleton.shape: (T, 64, 3)
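Because the raw data is stored at 120 fps, the duration of a clip follows directly from the number of frames T. The sketch below uses a synthetic array with the documented (T, 64, 3) layout in place of a real file; the helper name `clip_duration_seconds` is hypothetical.

```python
import numpy as np

FPS = 120  # capture rate stated in the dataset description


def clip_duration_seconds(frames, fps=FPS):
    """Length of a clip in seconds, given an array whose first axis is time."""
    return frames.shape[0] / fps


# Synthetic stand-in for np.load('skeletons/G001T000A000R000/P1.npy')
skeleton = np.zeros((240, 64, 3), dtype=np.float32)
print(clip_duration_seconds(skeleton))  # 240 frames at 120 fps -> 2.0
```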

Data preprocessing

We use the SMPL-X parameters directly to train the model. You can download the processed motion and text data from the original Google Drive link or from Baidu Netdisk.

processed_data
├── glove
│   ├── hhi_vab_data.npy
│   ├── hhi_vab_idx.pkl
│   └── hhi_vab_words.pkl
├── motions
│   ├── test.h5
│   ├── train.h5
│   └── val.h5
└── texts_processed
    ├── G001T000A000R000.txt
    ├── G001T000A000R001.txt
    └── ......

Alternatively, you can preprocess the data yourself as follows:

Commands for preprocessing the Inter-X dataset for training and evaluation:
  1. Clone the repository with the following command:

    git clone https://github.com/liangxuy/Inter-X.git
    cd Inter-X/preprocessing
    
  2. Set up the environment

    • Install ffmpeg (if not already installed)
      sudo apt update
      sudo apt install ffmpeg
      
    • Set up the conda environment
      conda env create -f environment.yml
      conda activate inter-x
      python -m spacy download en_core_web_sm
      pip install git+https://github.com/openai/CLIP.git
      
      You can also install en_core_web_sm manually by downloading en_core_web_sm-2.3.0.tar.gz and then running pip install en_core_web_sm-2.3.0.tar.gz.
  3. Prepare the motions.zip, texts.zip, splits, etc.

  4. Run the commands one by one:

      1. Motion data processing, we downsample to 30 fps for training and evaluation
      python 1_prepare_data.py
      
      2. Split train, test and val
      python 2_split_train_val.py
      
      3. Processing text annotations

      Download the glove.6B.zip and set the path of glove_file.

      python 3_text_process.py
      
      4. For human reaction generation
      python 4_reaction_generation.py
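
The motion processing step above reduces the 120 fps data to 30 fps. One simple way to do this is to keep every fourth frame, as sketched below. This is an illustration under that assumption only; it is not necessarily how 1_prepare_data.py performs the downsampling, and the helper name `downsample` is hypothetical.

```python
import numpy as np


def downsample(frames, src_fps=120, dst_fps=30):
    """Keep every (src_fps // dst_fps)-th frame along the first (time) axis."""
    if src_fps % dst_fps != 0:
        raise ValueError("src_fps must be an integer multiple of dst_fps")
    step = src_fps // dst_fps
    return frames[::step]


# Toy example: 12 frames at 120 fps become 3 frames at 30 fps.
motion_120 = np.arange(12).reshape(12, 1)
motion_30 = downsample(motion_120)
print(motion_30.ravel())  # frames 0, 4 and 8 are kept
```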
      

Text to Motion

The code for this part is under evaluation/text2motion. We follow the text-to-motion work to train and evaluate the text2motion model. Build the environment as described in the original repository, then link the data folder to ./dataset/inter-x with a soft link.

Commands for training and evaluating the text2motion model:

We provide the trained models via the dataset Google Drive link. To skip steps 1 to 4, download the checkpoints and place them under checkpoints/hhi, organized as follows:

checkpoints/hhi
├── Comp_v6_KLD01
│   ├── model
│   │   └── latest.tar
│   └── opt.txt
├── Decomp_SP001_SM001_H512
│   └── model
│       └── latest.tar
├── length_est_bigru
│   └── model
│       └── latest.tar
└── text_mot_match
    └── model
        └── finest.tar
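
Before running the evaluation with downloaded checkpoints, it may help to confirm that the files match the layout above. The following is a minimal sketch; the relative paths are copied from the tree, while the helper name `missing_checkpoints` is hypothetical and not part of the repository.

```python
from pathlib import Path

# Relative paths copied from the checkpoint layout above.
EXPECTED_FILES = [
    "Comp_v6_KLD01/model/latest.tar",
    "Comp_v6_KLD01/opt.txt",
    "Decomp_SP001_SM001_H512/model/latest.tar",
    "length_est_bigru/model/latest.tar",
    "text_mot_match/model/finest.tar",
]


def missing_checkpoints(root):
    """Return the expected files that are absent under `root`."""
    root = Path(root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


# Example usage from evaluation/text2motion:
# print(missing_checkpoints("checkpoints/hhi"))  # an empty list means all files are in place
```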
  1. Training motion autoencoder

    python train_decomp_v3.py --name Decomp_SP001_SM001_H512 --gpu_id 0 --window_size 24 --dataset_name hhi
    
  2. Training text2length model

    python train_length_est.py --name length_est_bigru --gpu_id 0 --dataset_name hhi
    
  3. Training text2motion model

    python train_comp_v6.py --name Comp_v6_KLD01 --gpu_id 0 --lambda_kld 0.01 --dataset_name hhi
    
  4. Training motion & text feature extractors

    python train_tex_mot_match.py --name text_mot_match --gpu_id 0 --batch_size 8 --dataset_name hhi
    
  5. Quantitative Evaluations

    python final_evaluation.py
    

    The statistical results will be saved to ./hhi_evaluation.log.

Action to Motion

Coming soon!

Citation

If you find the Inter-X dataset useful for your research, please cite us:

@inproceedings{xu2024inter,
  title={Inter-x: Towards versatile human-human interaction analysis},
  author={Xu, Liang and Lv, Xintao and Yan, Yichao and Jin, Xin and Wu, Shuwen and Xu, Congsheng and Liu, Yifan and Zhou, Yizhou and Rao, Fengyun and Sheng, Xingdong and others},
  booktitle={CVPR},
  pages={22260--22271},
  year={2024}
}