# README.md

This repository contains SoTA algorithms, models, and interesting projects in the

ONE is short for "ONE for all".
## News

- [2025.04.10] We release MindONE [v0.3.0](https://github.com/mindspore-lab/mindone/releases/tag/v0.3.0). More than 15 SoTA generative models are added, including Flux, CogView4, OpenSora 2.0, Movie Gen 30B, and CogVideoX 5B~30B. Have fun!
- [2025.02.21] We support DeepSeek [Janus-Pro](https://huggingface.co/deepseek-ai/Janus-Pro-7B), a SoTA multimodal understanding and generation model. See [here](examples/janus).
- [2024.11.06] MindONE [v0.2.0](https://github.com/mindspore-lab/mindone/releases/tag/v0.2.0) is released.

## Quick tour

To install MindONE v0.3.0, please install [MindSpore 2.5.0](https://www.mindspore.cn/install) and run `pip install mindone`.

Alternatively, to install the latest version from the `master` branch, please run:

```shell
# Assumed source-install command; adapt it if the repo documents another.
pip install git+https://github.com/mindspore-lab/mindone.git
```

You can then generate an image in a few lines. The import and pipeline-setup lines below are a sketch, assuming `mindone.diffusers` exposes `StableDiffusion3Pipeline` with the same interface as Hugging Face diffusers:

```python
import mindspore
from mindone.diffusers import StableDiffusion3Pipeline

pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers",
    mindspore_dtype=mindspore.float16,
)
prompt = "A cat holding a sign that says 'Hello MindSpore'"
image = pipe(prompt)[0][0]
image.save("sd3.png")
```
### run hf diffusers on mindspore

- mindone diffusers is under active development; most tasks were tested with MindSpore 2.5.0 on Ascend Atlas 800T A2 machines.
- compatible with hf diffusers 0.32.2

| component | features |
| :--- | :-- |
| [pipeline](https://github.com/mindspore-lab/mindone/tree/master/mindone/diffusers/pipelines) | support text-to-image, text-to-video, and text-to-audio tasks, 160+ |
| [models](https://github.com/mindspore-lab/mindone/tree/master/mindone/diffusers/models) | support autoencoder & transformer base models same as hf diffusers, 50+ |
| [schedulers](https://github.com/mindspore-lab/mindone/tree/master/mindone/diffusers/schedulers) | support diffusion schedulers (e.g., ddpm and dpm solver) same as hf diffusers, 35+ |

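Because the components above mirror Hugging Face diffusers, the familiar workflow of loading a pipeline and swapping its scheduler carries over. A minimal sketch, assuming `DiffusionPipeline` and `DPMSolverMultistepScheduler` are exported from `mindone.diffusers` with HF-compatible signatures (the model id is illustrative):

```python
# Sketch of the HF-diffusers-style workflow on MindSpore; names and
# signatures are assumed to match Hugging Face diffusers.
import mindspore
from mindone.diffusers import DiffusionPipeline, DPMSolverMultistepScheduler

pipe = DiffusionPipeline.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5",  # illustrative model id
    mindspore_dtype=mindspore.float16,
)
# Swap the sampler exactly as you would with HF diffusers.
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
# Pipelines return nested outputs; [0][0] selects the first image,
# matching the indexing used in the quick tour above.
image = pipe("a watercolor painting of a fox")[0][0]
image.save("fox_dpm.png")
```

This requires an Ascend device and downloaded weights, so treat it as a template rather than a copy-paste test.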
### supported models under mindone/examples
| task | model | inference | finetune | pretrain | institute |

# examples/README.md

| model | codebase style | original repo |
| :--- | :-- | :-- |
| [cogview](https://github.com/mindspore-lab/mindone/blob/master/examples/cogview) | THUDM official | https://github.com/THUDM/CogView4 |
| [wan2_1](https://github.com/mindspore-lab/mindone/blob/master/examples/wan2_1) | Alibaba Wan Group official | https://github.com/Wan-Video/Wan2.1 |
| [step_video_t2v](https://github.com/mindspore-lab/mindone/blob/master/examples/step_video_t2v) | StepFun official | https://github.com/stepfun-ai/Step-Video-T2V |
| [janus](https://github.com/mindspore-lab/mindone/blob/master/examples/janus) | DeepSeek AI official | https://github.com/deepseek-ai/Janus |
| [emu3](https://github.com/mindspore-lab/mindone/blob/master/examples/emu3) | BAAI Vision official | https://github.com/baaivision/Emu3 |
| [var](https://github.com/mindspore-lab/mindone/blob/master/examples/var) | ByteDance FoundationVision official | https://github.com/FoundationVision/VAR |
| [hpcai open sora](https://github.com/mindspore-lab/mindone/blob/master/examples/opensora_hpcai) | HPC-AI Tech official | https://github.com/hpcaitech/Open-Sora |
| [open sora plan](https://github.com/mindspore-lab/mindone/blob/master/examples/opensora_pku) | PKU-YuanGroup official | https://github.com/PKU-YuanGroup/Open-Sora-Plan |
| [flux](https://github.com/mindspore-lab/mindone/blob/master/examples/flux) | Black Forest Labs official | https://github.com/black-forest-labs/flux |
| [movie gen](https://github.com/mindspore-lab/mindone/blob/master/examples/moviegen) | implemented by the MindONE team, based on the Movie Gen paper by Meta | https://arxiv.org/abs/2410.13720 |
| [hunyuan3d-1.0](https://github.com/mindspore-lab/mindone/blob/master/examples/hunyuan3d_1) | Tencent official | https://github.com/Tencent/Hunyuan3D-1 |
| [magvit](https://github.com/mindspore-lab/mindone/blob/master/examples/magvit) | implemented by the MindONE team, based on the MAGVIT-v2 paper by Google | https://arxiv.org/pdf/2310.05737 |
| [instantmesh](https://github.com/mindspore-lab/mindone/blob/master/examples/instantmesh) | Tencent ARC Lab official | https://github.com/TencentARC/InstantMesh |
| [hunyuanvideo](https://github.com/mindspore-lab/mindone/blob/master/examples/hunyuanvideo) | Tencent official | https://github.com/Tencent/HunyuanVideo |
| [story_diffusion](https://github.com/mindspore-lab/mindone/blob/master/examples/story_diffusion) | HVision-NKU official | https://github.com/HVision-NKU/StoryDiffusion |
| [dynamicrafter](https://github.com/mindspore-lab/mindone/blob/master/examples/dynamicrafter) | Tencent Research official | https://github.com/Doubiiu/DynamiCrafter |
| [hunyuan_dit](https://github.com/mindspore-lab/mindone/blob/master/examples/hunyuan_dit) | Tencent Research official | https://github.com/Tencent/HunyuanDiT |
| [pixart_sigma](https://github.com/mindspore-lab/mindone/blob/master/examples/pixart_sigma) | Huawei Noah's Ark Lab official | https://github.com/PixArt-alpha/PixArt-sigma |
| [svd](https://github.com/mindspore-lab/mindone/blob/master/examples/svd) | Stability AI official | https://github.com/Stability-AI/generative-models |
| [mvdream](https://github.com/mindspore-lab/mindone/blob/master/examples/mvdream) | ByteDance official | https://github.com/bytedance/MVDream |
| [sv3d](https://github.com/mindspore-lab/mindone/blob/master/examples/sv3d) | Stability AI official | https://github.com/Stability-AI/generative-models |
| [hunyuanvideo-i2v](https://github.com/mindspore-lab/mindone/blob/master/examples/hunyuanvideo-i2v) | Tencent official | https://github.com/Tencent/HunyuanVideo-I2V |
| [venhancer](https://github.com/mindspore-lab/mindone/blob/master/examples/venhancer) | Vchitect, Shanghai AI Laboratory official | https://github.com/Vchitect/VEnhancer |
| [stable diffusion](https://github.com/mindspore-lab/mindone/blob/master/examples/stable_diffusion_v2) | Stability AI official | https://github.com/Stability-AI/stablediffusion |
| [stable diffusion xl](https://github.com/mindspore-lab/mindone/blob/master/examples/stable_diffusion_xl) | Stability AI official | https://github.com/Stability-AI/generative-models |
| [ip adapter](https://github.com/vigo999/mindone/tree/master/examples/ip_adapter) | Tencent AI Lab official | https://github.com/tencent-ailab/IP-Adapter |
| [t2i-adapter](https://github.com/vigo999/mindone/tree/master/examples/t2i_adapter) | ARC Lab, Tencent PCG official | https://github.com/TencentARC/T2I-Adapter |
| [dit](https://github.com/mindspore-lab/mindone/blob/master/examples/dit) | Facebook Research official | https://github.com/facebookresearch/DiT |
| [fit](https://github.com/mindspore-lab/mindone/blob/master/examples/fit) | Shanghai AI Lab official | https://github.com/whlzy/Fit |
| [latte](https://github.com/mindspore-lab/mindone/blob/master/examples/latte) | Vchitect, Shanghai AI Laboratory official | https://github.com/Vchitect/Latte |
| [t2v_turbo](https://github.com/mindspore-lab/mindone/tree/master/examples/t2v_turbo) | Google official | https://github.com/Ji4chenLi/t2v-turbo |
| [video composer](https://github.com/mindspore-lab/mindone/tree/master/examples/videocomposer) | ali-vilab official | https://github.com/ali-vilab/videocomposer |
| [animatediff](https://github.com/mindspore-lab/mindone/tree/master/examples/animatediff) | Yuwei Guo official | https://github.com/guoyww/animatediff/ |
| [qwen2-vl](https://github.com/mindspore-lab/mindone/tree/master/examples/qwen2_vl) | HF transformers official | https://huggingface.co/Qwen/Qwen2-VL-7B-Instruct |