Add support for LiquidAI's LFM2.5-VL vision-language model #9729
Conversation
LFM2.5-VL requires transformers>=4.58.0 or a specific commit (3c2517727ce28a30f5044e01663ee204deb1cdbe) because it depends on the new TokenizersBackend class, which is not available in transformers 4.57.1. This PR adds a version check in patcher.py that raises an informative error with installation instructions when the model is loaded with an incompatible transformers version.
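For illustration, such a guard could be written as below. The helper name is hypothetical; `require_version` is the standard transformers utility, and the exact message in patcher.py may differ.

```python
# Hypothetical sketch of the version guard described above. The helper name is
# illustrative; require_version raises an ImportError with the hint appended
# when the installed version does not satisfy the requirement.
from transformers.utils.versions import require_version


def _check_lfm2_vl_version() -> None:
    require_version(
        "transformers>=4.58.0",
        "LFM2.5-VL requires the TokenizersBackend class. To fix: "
        "pip install 'transformers>=4.58.0', or install the pinned commit: "
        "pip install git+https://github.com/huggingface/transformers.git@3c2517727ce28a30f5044e01663ee204deb1cdbe",
    )
```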
Fix an infinite-loop bug in LFMVLPlugin.process_messages() that occurred when expanding image tokens. Because both IMAGE_PLACEHOLDER and self.image_token were `<image>`, each replacement inserted tokens that the loop then matched again as fresh placeholders to expand.
Solution: use a two-phase replacement pattern (matching Qwen2VLPlugin), as sketched below:
1. First replace `<image>` → `{{image}}` × N (intermediate placeholder)
2. After the loop, replace `{{image}}` → `<image>` (actual token)
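A minimal sketch of the pattern with the surrounding plugin machinery stripped away (function and variable names are illustrative, not the plugin's actual code):

```python
# Minimal sketch of the two-phase expansion. IMAGE_PLACEHOLDER equals the real
# image token here, which is exactly what made single-phase replacement loop.
IMAGE_PLACEHOLDER = "<image>"


def expand_image_tokens(content: str, image_seqlens: list[int], image_token: str = "<image>") -> str:
    for seqlen in image_seqlens:
        # Phase 1: expand one placeholder at a time into an intermediate marker
        # that later iterations cannot mistake for an unexpanded placeholder.
        content = content.replace(IMAGE_PLACEHOLDER, "{{image}}" * seqlen, 1)

    # Phase 2: swap the intermediate marker back to the actual image token.
    return content.replace("{{image}}", image_token)


# e.g. expand_image_tokens("Describe: <image>", [3])
# -> "Describe: <image><image><image>"
```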
Also adds a proper _get_mm_inputs override that runs images through the LFM2.5-VL image processor and retrieves the spatial_shapes tensor used to compute a dynamic token count per image (see the sketch below).
Token calculation: (spatial_h × spatial_w) / (downsample_factor²)
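Under that description, the per-image counts could be derived roughly as follows; the attribute names (spatial_shapes, downsample_factor) follow the PR text and may not match the real processor API exactly:

```python
# Hedged sketch of the dynamic token count per image.
import torch


def image_seqlens(spatial_shapes: torch.Tensor, downsample_factor: int) -> list[int]:
    # spatial_shapes: (num_images, 2) tensor of per-image (height, width) patch grids.
    return [int(h * w) // downsample_factor**2 for h, w in spatial_shapes.tolist()]
```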
Rename the template and plugin from `lfm_vl` to `lfm2_vl` to match the model's config.model_type ("lfm2_vl"), following the same pattern as qwen2_vl; a registration sketch follows the file list below.
Files updated:
- mm_plugin.py: Plugin registration
- template.py: Template name and mm_plugin reference
- constants.py: Model group template reference
- test_mm_plugin.py: Test function and variable names
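For illustration, the renamed registration might look roughly like this. The names follow LLaMA-Factory's template.py and mm_plugin.py conventions, and the real _register_template call carries many more formatting arguments, elided here:

```python
# Hedged, abbreviated sketch of the renamed registration; not the PR's exact code.
from llamafactory.data.mm_plugin import get_mm_plugin
from llamafactory.data.template import _register_template

_register_template(
    name="lfm2_vl",  # was "lfm_vl"; now matches config.model_type
    mm_plugin=get_mm_plugin(name="lfm2_vl", image_token="<image>"),
)
```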
Summary of Changes
Hello @vovanphuc, I'm Gemini Code Assist! Here's a summary to help you and other reviewers quickly get up to speed. This pull request integrates LiquidAI's LFM2.5-VL vision-language model into LLaMA-Factory, significantly enhancing its multimodal capabilities. The changes introduce dynamic handling of image tokens based on image resolution, alongside a tailored chat template and the necessary system configuration, allowing the framework to process and generate responses that combine text and visual information.
hiyouga left a comment
LGTM
Code Review
This pull request adds support for LiquidAI's LFM2.5-VL vision-language model. The changes include a new LFMVLPlugin for dynamic image token expansion, a corresponding chat template, model registration, and a transformers version check. The implementation is mostly correct, but I've found a critical issue in the LFMVLPlugin regarding image batching that will affect training with batch sizes greater than one. I've also suggested an improvement to the unit test to cover the new plugin's core logic.
Summary
Add multimodal (vision-language) support for LiquidAI's LFM2.5-VL to LLaMA-Factory.
Changes
- `LFMVLPlugin` class with dynamic image token expansion based on spatial shapes
- `lfm2_vl` chat template with multimodal plugin support
- Model registered with `multimodal=True`

Supported Models
- LiquidAI/LFM2.5-VL-1.6B

Key Features
- Dynamic per-image token count: (spatial_h × spatial_w) / downsample_factor² (worked example below)
- `<image>` token (ID 396) with SigLIP2 NaFlex vision encoder
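For instance, applying the formula above (the grid size and downsample factor here are illustrative values, not taken from the model config):

```python
# Illustrative arithmetic only: a 16 x 16 patch grid from the vision encoder
# with downsample_factor = 2 (both values assumed).
spatial_h, spatial_w, downsample_factor = 16, 16, 2
num_image_tokens = (spatial_h * spatial_w) // downsample_factor**2
print(num_image_tokens)  # 64 -> the prompt gets 64 <image> tokens for this image
```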
Test plan

- Tested with `trust_remote_code=True`

References