SmartFreeEdit: Mask-Free Spatial-Aware Image Editing with Complex Instruction Understanding

Qianqian Sun1*, Jixiang Luo2†, Dell Zhang2, Xuelong Li2

1 The University of Hong Kong, 2 Institute of Artificial Intelligence (TeleAI)

arXiv Huggingface space

We propose SmartFreeEdit to address the challenge of reasoning over instructions and segmentation in image editing, making AI editing more practical. Our method handles a range of semantic editing operations: adding, removing, and changing objects, changing the background, and global editing.

⏱️ Update News

  • [2025.7.07] Webpage has been released!
  • [2025.7.05] Our paper has been accepted by ACMMM'2025, and the code for image editing is released!
  • [2025.4.17] Our paper has been released on arXiv and is currently under review for ACM Multimedia 2025 (ACMMM 2025).

📖 Pipeline

SmartFreeEdit consists of three key components: 1) An MLLM-driven Promptist that decomposes instructions into Editing Objects, Category, and Target Prompt. 2) A Reasoning Segmentation module that converts the prompt into an inference query and generates reasoning masks. 3) An Inpainting-based Image Editor that uses a hypergraph computation module to improve its understanding of global image structure for more accurate edits.
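The data flow between the three components can be sketched as follows. This is only an illustration of the order of the stages, not the repository's actual API: `EditPlan`, `build_query`, `run_pipeline`, the field names, and the query wording are all hypothetical, and the three stages are passed in as plain callables so that any model can be plugged in.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class EditPlan:
    """Output of the promptist stage (field names are hypothetical)."""
    editing_object: str  # what to edit, e.g. "the red mug"
    category: str        # e.g. "add", "remove", "change", "background", "global"
    target_prompt: str   # description of the desired result


def build_query(plan: EditPlan) -> str:
    # Hypothetical query template; the wording used by the real system may differ.
    return f"Please segment {plan.editing_object} in this image."


def run_pipeline(
    image: Any,
    instruction: str,
    promptist: Callable[[Any, str], EditPlan],
    segmenter: Callable[[Any, str], Any],
    editor: Callable[[Any, Any, str], Any],
) -> Any:
    plan = promptist(image, instruction)        # 1) decompose the instruction
    mask = segmenter(image, build_query(plan))  # 2) reasoning mask from the query
    return editor(image, mask, plan.target_prompt)  # 3) inpaint inside the mask
```

Because no mask is supplied by the user, the only inputs are the image and the instruction; the mask is produced internally by the second stage.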

🚀 Getting Started

Environment Requirement 🌍

Clone the repo:

git clone https://github.com/smileformylove/SmartFreeEdit

We recommend first creating a virtual environment with conda, then running:

conda create -n smartfreeedit python=3.10.6 -y
conda activate smartfreeedit
python -m pip install --upgrade pip
pip install torch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 --index-url https://download.pytorch.org/whl/cu117

Then, you can install diffusers (implemented in this repo) with:

pip install -e .

Finally, install the remaining dependencies:

pip install -r requirements.txt
pip install flash-attn --no-build-isolation

Downloading Checkpoints

Checkpoints of SmartFreeEdit can be downloaded using the following command:

python SmartFreeEdit/download.py

Running Gradio demo

We provide demo scripts for different hardware configurations. For users with server access and sufficient memory (more than 40 GB of CPU memory and 24 GB of GPU memory), we recommend:

export PYTHONPATH=.:$PYTHONPATH

export CUDA_VISIBLE_DEVICES=0

python SmartFreeEdit/src/smartfreeedit_app.py

Training

To train SmartFreeEdit, you need to download the BrushData here and Checkpoints of BrushNet here.

The data structure should look like this:

|-- data
    |-- BrushData
    |-- BrushBench
    |-- EditBench
    |-- ckpt
        |-- realisticVisionV60B1_v51VAE
            |-- model_index.json
            |-- vae
            |-- ...
        |-- segmentation_mask_brushnet_ckpt
        |-- segmentation_mask_brushnet_ckpt_sdxl_v0
        |-- random_mask_brushnet_ckpt
        |-- random_mask_brushnet_ckpt_sdxl_v0
        |-- ...
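Before launching training or evaluation, it can help to confirm that the downloads landed in the expected places. The following is a minimal sketch, not part of the repository: `EXPECTED` and `missing_entries` are hypothetical names, the benchmark folder is spelled `BrushBench` to match the `data/BrushBench` path used by the evaluation command, and entries elided with `...` in the tree are not checked.

```python
from pathlib import Path

# Entries copied from the directory tree; elided ("...") entries are skipped.
EXPECTED = [
    "BrushData",
    "BrushBench",
    "EditBench",
    "ckpt/realisticVisionV60B1_v51VAE/model_index.json",
    "ckpt/realisticVisionV60B1_v51VAE/vae",
    "ckpt/segmentation_mask_brushnet_ckpt",
    "ckpt/segmentation_mask_brushnet_ckpt_sdxl_v0",
    "ckpt/random_mask_brushnet_ckpt",
    "ckpt/random_mask_brushnet_ckpt_sdxl_v0",
]


def missing_entries(data_root, expected=EXPECTED):
    """Return the expected relative paths that do not exist under data_root."""
    root = Path(data_root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    # Assumes the script is run from the repository root.
    for rel in missing_entries("data"):
        print(f"missing: data/{rel}")
```

An empty result means every listed file and folder was found; anything printed points at a download that still needs to be placed.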

Train with segmentation masks using the script:

accelerate launch train/brushnet/train_brushnet.py \
--pretrained_model_name_or_path data/base_model/realisticVisionV60B1_v51VAE \
--output_dir examples/hyper \
--train_data_dir data/BrushData \
--resolution 512 \
--learning_rate 1e-5 \
--train_batch_size 8 \
--tracker_project_name brushnet \
--report_to tensorboard \
--validation_steps 300
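When trying several settings, the same command can be assembled programmatically. The sketch below is an assumption about how one might script this, not something shipped with the repository: `build_train_command` is a hypothetical helper, and its defaults simply mirror the flags and values of the command above.

```python
import shlex


def build_train_command(overrides=None):
    """Build the accelerate launch command as a list of arguments."""
    # Defaults mirror the training command shown in this README.
    args = {
        "pretrained_model_name_or_path": "data/base_model/realisticVisionV60B1_v51VAE",
        "output_dir": "examples/hyper",
        "train_data_dir": "data/BrushData",
        "resolution": "512",
        "learning_rate": "1e-5",
        "train_batch_size": "8",
        "tracker_project_name": "brushnet",
        "report_to": "tensorboard",
        "validation_steps": "300",
    }
    args.update(overrides or {})
    cmd = ["accelerate", "launch", "train/brushnet/train_brushnet.py"]
    for name, value in args.items():
        cmd += [f"--{name}", str(value)]
    return cmd


if __name__ == "__main__":
    # Print a shell-ready command with a smaller batch size.
    print(shlex.join(build_train_command({"train_batch_size": 4})))
```

The returned list can be passed directly to `subprocess.run`, which avoids shell quoting issues when paths contain spaces.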

You can run inference with the script:

python train/brushnet/test_brushnet.py

You can evaluate using the script:

python train/brushnet/evaluate_brushnet.py \
--brushnet_ckpt_path data/ckpt/segmentation_mask_brushnet_ckpt \
--image_save_path runs/evaluation_result/BrushBench/brushnet_segmask/inside \
--mapping_file data/BrushBench/mapping_file.json \
--base_dir data/BrushBench \
--mask_key inpainting_mask

Inference

Please download the Reason-Edit evaluation benchmark from SmartEdit and put it in the dataset folder.

Use the following script to run inference on the understanding and reasoning scenes:


python test/ReasonEdit_test.py --save_dir /SmartFreeEdit/edited_images --ReasonEdit_benchmark_dir /dataset/Reasonedit

🖋️ Citation

If you find our work helpful, please star ⭐this repo and cite 📑 our paper. Thanks for your support!

@article{sun2025smartfreeedit,
  title={SmartFreeEdit: Mask-Free Spatial-Aware Image Editing with Complex Instruction Understanding},
  author={Sun, Qianqian and Luo, Jixiang and Zhang, Dell and Li, Xuelong},
  journal={arXiv preprint arXiv:2504.12704},
  year={2025}
}

📧 Contact

This repository is currently under active development and restructuring. The codebase is being optimized for better stability and reproducibility. While we strive to maintain code quality, you may encounter temporary issues during this transition period. For any questions or technical discussions, feel free to open an issue or contact us via email at [email protected].

👍🏻 Acknowledgements

Our code is built on BrushNet. Thanks to all the contributors!
