1st test: the trainer is working under Windows 11. Any additional optimization suggestions? #167
Description
I have compiled Triton 3.4.0, set num_workers=0, and removed all lambdas from the datasets.
Are there any other recommendations for optimization?
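For reference, the settings described above roughly correspond to a config like the following. This is a hedged sketch reconstructed from the configuration table printed below; the field names are illustrative and not verified against the trainer's actual YAML schema:

```yaml
# Illustrative low-VRAM settings (names approximate, values from the printed config table)
lora:
  rank: 16
  alpha: 16
optimization:
  steps: 100
  learning_rate: 1.0e-4
  batch_size: 1
  optimizer: adamw8bit
  gradient_checkpointing: true
acceleration:
  mixed_precision: bf16
  quantization: int8-quanto
data:
  dataloader_workers: 0   # avoids Windows multiprocessing/pickling issues (no lambdas needed)
```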
(LTX-2) PS F:\LTX-2\packages\ltx-trainer> Move-Item -Path "F:\LTX-2\Clothes*.pt" -Destination "F:\LTX-2.precomputed\conditions\Clothes"
(LTX-2) PS F:\LTX-2\packages\ltx-trainer> python scripts\train.py configs\ltx2_av_lora_low_vram.yaml
⚙️ Training Configuration
╭──────────────────────┬──────────────────────────────────────────────╮
│ 🎬 Model │ │
│ Base │ F:/LTX-2-Dateien/ltx-2.3-22b-dev.safetensors │
│ Text Encoder │ F:/gemma-original │
│ Training Mode │ LORA │
│ Load Checkpoint │ — │
│ │ │
│ 🔗 LoRA │ │
│ Rank / Alpha │ 16 / 16 │
│ Dropout │ 0.0 │
│ Target Modules │ to_k, to_q, to_v, to_out.0 │
│ │ │
│ 🎯 Strategy │ │
│ Name │ text_to_video │
│ Audio │ ✗ │
│ First Frame Cond P │ 0.5 │
│ │ │
│ ⚡ Optimization │ │
│ Steps │ 100 │
│ Learning Rate │ 1.00e-04 │
│ Batch Size │ 1 │
│ Grad Accumulation │ 1 │
│ Optimizer │ adamw8bit │
│ Scheduler │ linear │
│ Max Grad Norm │ 1.0 │
│ Grad Checkpointing │ ✓ │
│ │ │
│ 🚀 Acceleration │ │
│ Mixed Precision │ bf16 │
│ Quantization │ int8-quanto │
│ Text Encoder 8bit │ ✓ │
│ │ │
│ 🎥 Validation │ │
│ Prompts │ 1 prompt(s) │
│ Interval │ Every 50 steps │
│ Video Dims │ 512x512, 1 frames │
│ Frame Rate │ 25.0 fps │
│ Inference Steps │ 30 │
│ CFG Scale │ 4.0 │
│ STG │ scale=1.0; blocks=29; mode=stg_v │
│ Seed │ 42 │
│ │ │
│ 📂 Data & Output │ │
│ Dataset │ F:/LTX-2/.precomputed │
│ Dataloader Workers │ 0 │
│ Output Dir │ F:\LTX-2\outputs\ltx2_image_test │
│ Seed │ 42 │
│ │ │
│ 🔌 Integrations │ │
│ Checkpoints │ Every 100 steps (keep -1) │
│ W&B │ Disabled │
│ HF Hub │ Disabled │
╰──────────────────────┴──────────────────────────────────────────────╯
DEBUG 🎯 Using TextToVideoStrategy training strategy (audio disabled 🔇) __init__.py:57
DEBUG Loading text encoder... trainer.py:359
torch_dtype is deprecated! Use dtype instead!
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:16<00:00, 3.26s/it]
DEBUG Loading embeddings processor... trainer.py:368
INFO Pre-computing embeddings for 1 validation prompts... trainer.py:378
DEBUG Validation prompt embeddings cached. Gemma model unloaded trainer.py:403
INFO Loading LTX-2 model from F:\LTX-2-Dateien\ltx-2.3-22b-dev.safetensors model_loader.py:329
DEBUG Loading transformer... model_loader.py:334
DEBUG Loading video VAE decoder... model_loader.py:346
INFO Quantizing model with "int8-quanto". This may take a while... trainer.py:451
DEBUG Quantizing model using block-by-block approach for memory efficiency quantization.py:91
DEBUG Quantizing remaining model components quantization.py:154
DEBUG Skipping quantization for module: patchify_proj quantization.py:161
DEBUG Skipping quantization for module: proj_out quantization.py:161
DEBUG Skipping quantization for module: audio_patchify_proj quantization.py:161
DEBUG Skipping quantization for module: audio_proj_out quantization.py:161
DEBUG Adding LoRA adapter with rank 16 trainer.py:489
DEBUG Trainable params count: 106,954,752 trainer.py:480
DEBUG GPU memory usage after models preparation: 21.87 GB trainer.py:570
DEBUG Process 0 using seed: 42 trainer.py:113
DEBUG Loaded dataset with 17 samples from sources: ['latents', 'conditions'] trainer.py:609
INFO 💾 Training configuration saved to: training_config.yaml trainer.py:967
INFO 🚀 Starting training... trainer.py:128
INFO 🎥 Validation samples for step 50 saved in samples trainer.py:856
INFO 🎥 Validation samples for step 100 saved in samples trainer.py:856
INFO 💾 Lora weights for step 100 saved in checkpoints\lora_weights_step_00100.safetensors trainer.py:925
Training 100/100 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Loss: 0.3516 | LR: 1.00e-05 | 32.59s/step 0:03:05 ETA: 00:00
INFO 💾 Lora weights for step 100 saved in checkpoints\lora_weights_step_00100.safetensors trainer.py:925
INFO 📊 Training Statistics: trainer.py:872
- Total time: 3.1 minutes
- Training speed: 0.54 steps/second
- Samples/second: 0.54
- Peak GPU memory: 28.76 GB
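As a quick sanity check, the reported training speed is consistent with the total time in the summary (a minimal Python sketch; the 32.59 s/step figure shown in the progress bar is the trainer's own per-step reading, not derived here):

```python
# Verify the summary's "Samples/second" from the reported total time.
total_minutes = 3.1        # "Total time: 3.1 minutes"
steps = 100                # configured training steps (batch size 1, so steps == samples)
steps_per_second = steps / (total_minutes * 60)
print(f"{steps_per_second:.2f} steps/second")
```

This reproduces the reported 0.54 steps/second.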
(LTX-2) PS F:\LTX-2\packages\ltx-trainer>