WonderZoom: Multi-Scale 3D World Generation
Jin Cao*, Koven Yu*, Jiajun Wu
(* denotes equal contribution)
WonderZoom generates multi-scale 3D worlds from a single image. Starting from an input photograph, it constructs a 3D Gaussian Splatting scene that supports continuous zoom-in navigation, revealing new details and objects at each scale level.
This release includes:
- Pre-trained 3D scenes for interactive real-time viewing
- Render-only server with web-based frontend
- Full generation pipeline (coming in a future release)
git clone https://github.com/jin-cao-tma/WonderZoom.git
cd WonderZoom
conda create -n wonderzoom python=3.10 -y
conda activate wonderzoom
# Install PyTorch (CUDA 12.1)
pip install torch==2.1.0 torchvision==0.16.0 --index-url https://download.pytorch.org/whl/cu121
# Install dependencies
pip install -r requirements.txt
cd submodules/depth-diff-gaussian-rasterization-min
pip install -e .
cd ../simple-knn
pip install -e .
cd ../..
pip install "git+https://github.com/facebookresearch/pytorch3d.git@stable"

Download the .pth files from our HuggingFace dataset and place them in the gaussian/ directory:
# Using huggingface-cli
pip install huggingface_hub
huggingface-cli download TmaKiss/WonderZoom --repo-type dataset --local-dir ./

Or download individual scenes:
from huggingface_hub import hf_hub_download
scenes = [
    "gau_bird3_complete1080.pth",      # street scene
    "gau_fish1_complete1080.pth",      # coral reef
    "gau_beetle1_complete1080.pth",    # tree / beetle
    "gau_conch1_complete1080.pth",     # beach
    "gau_lizard_complete1080.pth",     # wooden wall / lizard
    "gau_ladybug1_complete1080.pth",   # sunflower / ladybug
    "gau_butterfly_complete1080.pth",  # tea garden / butterfly
    "gau_butterfly_complete.pth",      # tea garden (480p)
    "gau_conch_complete.pth",          # beach (480p)
    "gau_lego_complete.pth",           # lego (480p)
]
for scene in scenes:
    # Each file is saved under ./gaussian/ because the repo path is kept.
    hf_hub_download(
        repo_id="TmaKiss/WonderZoom",
        repo_type="dataset",
        filename=f"gaussian/{scene}",
        local_dir="./",
    )

conda activate wonderzoom
# Pick a scene config
python run_render_only.py --example_config ./config/more_examples/street.yaml --port 7747

Available scene configs:
| Config | Scene | Resolution |
|---|---|---|
| street.yaml | City street with bird | 720x1080 |
| fish.yaml | Coral reef with fish | 720x1080 |
| tree.yaml | Lakeside tree with beetle | 720x1080 |
| beach.yaml | Beach with conch shell | 720x1080 |
| sunflower.yaml | Sunflower field with ladybug | 720x1080 |
| wooden.yaml | Wooden wall with lizard | 720x1080 |
| tea_garden.yaml | Tea garden with butterfly | 720x1080 |
| lego.yaml | Lego scene | 480x720 |
| beach2.yaml | Beach (480p) | 480x720 |
| tea_garden2.yaml | Tea garden (480p) | 480x720 |
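To launch several of the configs listed above from a script, the command line can be assembled programmatically. The following is a minimal sketch: `build_render_command` is a hypothetical helper (not part of the repository), and it only uses the `--example_config`, `--port`, and `--pth_path` flags that appear in this README.

```python
from pathlib import Path
from typing import List, Optional

# Directory holding the per-scene configs, as listed in this README.
CONFIG_DIR = Path("config/more_examples")


def build_render_command(
    config_name: str,
    port: int = 7747,
    pth_path: Optional[str] = None,
) -> List[str]:
    """Assemble the argv list for run_render_only.py (hypothetical helper)."""
    # Accept both "street" and "street.yaml".
    if not config_name.endswith(".yaml"):
        config_name += ".yaml"
    cmd = [
        "python", "run_render_only.py",
        "--example_config", f"./{(CONFIG_DIR / config_name).as_posix()}",
        "--port", str(port),
    ]
    # Only pass --pth_path when the caller supplies one explicitly.
    if pth_path is not None:
        cmd += ["--pth_path", pth_path]
    return cmd


if __name__ == "__main__":
    import shlex
    print(shlex.join(build_render_command("street")))
```

The returned list can be passed directly to `subprocess.run` from the repository root, with the `wonderzoom` environment active.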
You can also specify a .pth file directly:
python run_render_only.py --pth_path ./gaussian/gau_bird3_complete1080.pth \
  --example_config ./config/more_examples/street.yaml --port 7747

Streaming quality: If the viewer feels laggy over SSH, reduce the streaming quality by editing these two lines at the top of run_render_only.py:
IMAGE_COMPRESSION_QUALITY = 80 # lower (e.g. 20) = faster but blurrier
MAX_IMAGE_SIZE = 1080 # lower (e.g. 512) = faster streaming

If running on a remote server, set up SSH port forwarding:
ssh -L 7747:localhost:7747 <your-server>

Then open splat-main/index_stream.html in your browser.
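If the page stays blank, it can help to confirm that something is listening on the forwarded port before debugging the viewer itself. Below is a minimal sketch using only the Python standard library. `port_is_open` is a hypothetical helper, and 7747 is simply the port used in the examples above. A successful connection only shows that the port accepts TCP connections, not that the render server is healthy.

```python
import socket


def port_is_open(host: str = "localhost", port: int = 7747, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolved host names.
        return False


if __name__ == "__main__":
    status = "reachable" if port_is_open() else "not reachable"
    print(f"localhost:7747 is {status}")
```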
| Key | Action |
|---|---|
| W / A / S / D | Rotate camera |
| ↑ / ↓ / ← / → | Move camera |
| V | Zoom in |
| B | Zoom out |
| Space | Orbit rotation |
WonderZoom/
├── run_render_only.py # Render server (load .pth + real-time 3DGS rendering)
├── config/
│ ├── base-config.yaml # Base configuration
│ └── more_examples/ # Per-scene configs
├── gaussian/ # Pre-trained .pth files (download from HuggingFace)
├── gaussian_renderer/ # 3D Gaussian Splatting renderer
├── scene/ # GaussianModel and camera definitions
├── models/ # Point cloud processing models
├── splat-main/ # Web frontend viewer
├── submodules/ # Custom CUDA extensions
│ ├── depth-diff-gaussian-rasterization-min/
│ └── simple-knn/
├── util/ # Utility functions
└── utils/ # Loss functions and general utilities
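Given the layout above, a quick check that the pre-trained scenes actually landed in gaussian/ can save a failed launch. The sketch below is illustrative only: `missing_scenes` is a hypothetical helper, and the file names are copied from the download example in this README.

```python
from pathlib import Path
from typing import Iterable, List

# File names taken from the download example in this README.
EXPECTED_SCENES = [
    "gau_bird3_complete1080.pth",
    "gau_fish1_complete1080.pth",
    "gau_beetle1_complete1080.pth",
    "gau_conch1_complete1080.pth",
    "gau_lizard_complete1080.pth",
    "gau_ladybug1_complete1080.pth",
    "gau_butterfly_complete1080.pth",
    "gau_butterfly_complete.pth",
    "gau_conch_complete.pth",
    "gau_lego_complete.pth",
]


def missing_scenes(
    gaussian_dir: str = "gaussian",
    expected: Iterable[str] = EXPECTED_SCENES,
) -> List[str]:
    """Return the expected .pth file names that are absent from gaussian_dir."""
    root = Path(gaussian_dir)
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    # Run from the repository root.
    for name in missing_scenes():
        print(f"missing: gaussian/{name}")
```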
@misc{wonderzoom,
title={WonderZoom: Multi-Scale 3D World Generation},
author={Jin Cao and Hong-Xing Yu and Jiajun Wu},
year={2025},
eprint={2512.09164},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2512.09164}
}
- Release rendering and interactive visualization code
- Release Chain-of-Zoom integration for multi-scale zoom-in generation
- Release Gen3C integration for high-quality novel view synthesis
- Release Step1X-Edit integration for object editing
- [CVPR2025 Highlight] WonderWorld: Interactive 3D Scene Generation from a Single Image
We appreciate the authors of the following projects for sharing their code: 3D Gaussian Splatting, MoGe, GeometryCrafter, Marigold, OneFormer, RepViT-SAM, PyTorch3D, Chain-of-Zoom, Gen3C, Step1X-Edit, Stable Diffusion, VGGT, Kornia, and INR-Harmonization.

