[Feat] support longcat image lora train by wangshankun · Pull Request #1075 · ModelTC/LightX2V

wangshankun · 2026-05-18T08:23:59Z

No description provided.

gemini-code-assist

Code Review

This pull request introduces a new LoRA training configuration for the longcat_image model and updates the model implementation to freeze the text encoder, include a text processor in the pipeline, and provide inference keyword arguments. Feedback includes correcting a typo in a method call, replacing absolute paths in the configuration with portable alternatives, restoring the device argument for consistency in prompt encoding, and ensuring the denoiser input preparation method remains compatible with the base class interface.

gemini-code-assist · 2026-05-18T08:25:06Z

        prompt = sample["prompt"]
        if self.config.get("enable_prompt_rewrite_training", False):
-            prompt = self.text_pipeline.rewrite_prompt(prompt, self.device)
+            prompt = self.text_pipeline.rewire_prompt(prompt, self.device)


Typo detected: rewire_prompt should likely be rewrite_prompt. The configuration key enable_prompt_rewrite_training and the previous version of the code both use "rewrite". This will cause an AttributeError at runtime if the method does not exist.

Suggested change

-            prompt = self.text_pipeline.rewire_prompt(prompt, self.device)
+            prompt = self.text_pipeline.rewrite_prompt(prompt, self.device)
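
The failure mode described in this comment can be sketched as follows. This is a minimal illustration, not the project's code: `StubTextPipeline` and `maybe_rewrite` are hypothetical names, and the sketch assumes, as the review suspects, that the pipeline defines only `rewrite_prompt`. It shows that a misspelled method name raises `AttributeError` only when the guarded branch actually runs, so the bug stays hidden while `enable_prompt_rewrite_training` is off.

```python
class StubTextPipeline:
    """Hypothetical stand-in for the text pipeline; only rewrite_prompt is defined."""

    def rewrite_prompt(self, prompt, device):
        # A real pipeline would call a prompt-rewriting model; this stub only tags the prompt.
        return f"[rewritten on {device}] {prompt}"


def maybe_rewrite(pipeline, prompt, device, config):
    """Mirror the guarded call from the diff, using the correctly spelled method."""
    if config.get("enable_prompt_rewrite_training", False):
        return pipeline.rewrite_prompt(prompt, device)
    return prompt


pipeline = StubTextPipeline()
rewritten = maybe_rewrite(pipeline, "a cat", "cpu", {"enable_prompt_rewrite_training": True})
untouched = maybe_rewrite(pipeline, "a cat", "cpu", {})

# The misspelled name fails at call time, not at import time.
try:
    pipeline.rewire_prompt("a cat", "cpu")
    typo_error = None
except AttributeError as exc:
    typo_error = exc
```

Because the error surfaces only on the enabled path, a quick run with the flag set to true is a cheap way to confirm which spelling the real pipeline exposes.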

gemini-code-assist · 2026-05-18T08:25:06Z

@@ -0,0 +1,74 @@
+model:
+    name: longcat_image
+    pretrained_model_name_or_path: /mnt/miaohua/wangshankun/HF/hub/models--meituan-longcat--LongCat-Image/snapshots/d2ea50b79a930074c37b9b97ce45e3b2ea8cf4d8


The configuration contains absolute paths specific to a local environment (e.g., /mnt/miaohua/wangshankun/...). This makes the configuration non-portable and likely to fail for other users or in different environments. Consider using relative paths or environment variables/placeholders for model and data paths.
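
One way to act on this suggestion is to store a placeholder in the config and expand it when the config is loaded. The sketch below is an assumption about how that could look, not something the PR implements: `resolve_path` is a hypothetical helper, and `LONGCAT_MODEL_DIR` is an illustrative variable name that does not appear in the repository.

```python
import os
from string import Template


def resolve_path(raw_path, env=None):
    """Expand ${VAR} placeholders and a leading ~ in a config path.

    Template.substitute raises KeyError when a placeholder has no value,
    so a missing variable fails loudly instead of yielding a bogus path.
    """
    env = os.environ if env is None else env
    expanded = Template(raw_path).substitute(env)
    return os.path.expanduser(expanded)


# Hypothetical config value and environment, for illustration only.
example = resolve_path(
    "${LONGCAT_MODEL_DIR}/LongCat-Image",
    env={"LONGCAT_MODEL_DIR": "/data/models"},
)

try:
    resolve_path("${LONGCAT_MODEL_DIR}/LongCat-Image", env={})
    missing_error = None
except KeyError as exc:
    missing_error = exc
```

Failing on an unset variable is a deliberate choice here: a silently unexpanded placeholder would otherwise surface later as a confusing file-not-found error.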

gemini-code-assist · 2026-05-18T08:25:06Z

        prompt_embed, text_ids = self.text_pipeline.encode_prompt(
            prompt=prompt,
-            device=self.device,
            num_images_per_prompt=1,
        )


The device argument was removed from the encode_prompt call. While the pipeline might default to its own device, explicitly passing self.device ensures consistency and avoids potential device mismatch issues, especially since it was explicitly provided in the previous version.

Suggested change

        prompt_embed, text_ids = self.text_pipeline.encode_prompt(
            prompt=prompt,
+            device=self.device,
            num_images_per_prompt=1,
        )
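
The concern in this comment can be illustrated with a stub. Everything here is hypothetical: `StubTextPipeline` is not the real pipeline, its fallback-to-own-device behavior is an assumption, and plain strings stand in for tensors and devices. The sketch only shows why an omitted `device` argument can leave the embeddings on a different device than the trainer expects.

```python
class StubTextPipeline:
    """Hypothetical pipeline whose encode_prompt falls back to its own device."""

    def __init__(self, device):
        self.device = device

    def encode_prompt(self, prompt, device=None, num_images_per_prompt=1):
        # Use the caller's device when given, otherwise the pipeline's own.
        target = device if device is not None else self.device
        # Dicts stand in for tensors; only the device tag matters here.
        return {"device": target, "prompt": prompt}, {"device": target}


trainer_device = "cuda:1"
pipeline = StubTextPipeline(device="cpu")

# Without the argument, the result follows the pipeline's device.
implicit_embed, _ = pipeline.encode_prompt(prompt="a cat", num_images_per_prompt=1)

# With the argument, the result follows the trainer's device.
explicit_embed, _ = pipeline.encode_prompt(
    prompt="a cat", device=trainer_device, num_images_per_prompt=1
)
```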

gemini-code-assist · 2026-05-18T08:25:06Z

        return {"prompt_embed": prompt_embed, "text_ids": text_ids}

-    def prepare_denoiser_input(self, noisy_latent, sample, condition):
+    def prepare_denoiser_input(self, noisy_latent):


The signature of prepare_denoiser_input has been changed to accept only one argument (noisy_latent), which violates the interface defined in the base class BaseModel (which expects noisy_latent, sample, condition). To maintain compatibility with the base class while supporting callers that only provide one argument (like the current LoraTrainer), consider using optional arguments.

Suggested change

-    def prepare_denoiser_input(self, noisy_latent):
+    def prepare_denoiser_input(self, noisy_latent, sample=None, condition=None):
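
The compatibility argument can be checked with a reduced sketch. The `BaseModel` below keeps only the method under discussion, `LongCatImageModel` is a hypothetical class name, and the returned dictionary is an assumed shape rather than what the real method produces. The point is that optional parameters let both the one-argument call and the three-argument base-class call succeed.

```python
class BaseModel:
    """Reduced stand-in for the base class, keeping only the method in question."""

    def prepare_denoiser_input(self, noisy_latent, sample, condition):
        raise NotImplementedError


class LongCatImageModel(BaseModel):
    """Hypothetical subclass using the signature proposed in the review."""

    def prepare_denoiser_input(self, noisy_latent, sample=None, condition=None):
        # sample and condition are accepted for interface compatibility but unused here.
        return {"hidden_states": noisy_latent}


model = LongCatImageModel()

# Call style attributed to the current LoraTrainer: one argument.
one_arg = model.prepare_denoiser_input([0.0, 1.0])

# Call style implied by the base-class signature: three arguments.
three_args = model.prepare_denoiser_input(
    [0.0, 1.0], {"prompt": "a cat"}, {"prompt_embed": None}
)
```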

[Feat] support longcat image lora train

0508b8f

gemini-code-assist Bot reviewed May 18, 2026


wangshankun force-pushed the dev/longcat_image_lora_train branch from bf0e43c to 2dae37f Compare May 18, 2026 09:08

[bugfix] latent shape mismatch

6520131

wangshankun force-pushed the dev/longcat_image_lora_train branch from 2dae37f to 6520131 Compare May 18, 2026 12:32

[feat] support flux2 lora train

6ec9dde

wangshankun wants to merge 3 commits into main from dev/longcat_image_lora_train
