Commit 25236bb

mohiso22 and Mohit Soni authored
Modeling fix (#605)
Signed-off-by: Mohit Soni <[email protected]> Co-authored-by: Mohit Soni <[email protected]>
1 parent 7e8838f commit 25236bb

2 files changed: 3 additions, 0 deletions

QEfficient/transformers/models/modeling_auto.py

Lines changed: 2 additions & 0 deletions
@@ -1413,6 +1413,8 @@ def kv_offload_generate(
                 if x.startswith("past_") or x.endswith("_RetainedState")
             ]
         )
+        if not_mllama:
+            lang_session.skip_buffers(vision_outputs.keys())
 
         # Get first token
         lang_inputs["input_ids"] = outputs["logits"].argmax(2)
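
Reviewer note: the added lines mark the vision output buffers as skipped on the language session whenever `not_mllama` is true. The sketch below only illustrates that pattern with a toy stand-in. `ToySession`, its `required_inputs` and `run` methods, and the buffer names are hypothetical and do not reproduce the QEfficient session API. The assumption is that a skipped buffer is one the caller no longer has to supply on later runs.

```python
class ToySession:
    """Hypothetical stand-in for an inference session (not the QEfficient API)."""

    def __init__(self, input_names):
        self.input_names = list(input_names)
        self.skipped = set()

    def skip_buffers(self, names):
        # Record buffer names that the caller will not supply on later runs.
        self.skipped.update(names)

    def required_inputs(self):
        # Inputs that must still be passed to run(), in declaration order.
        return [n for n in self.input_names if n not in self.skipped]

    def run(self, inputs):
        missing = [n for n in self.required_inputs() if n not in inputs]
        if missing:
            raise KeyError(f"missing inputs: {missing}")
        return {"provided": sorted(inputs)}


# Hypothetical buffer names, chosen only for illustration.
vision_outputs = {"vision_embeds": object()}
not_mllama = True

lang_session = ToySession(
    ["input_ids", "position_ids", "vision_embeds", "past_key.0"]
)

# Same filter as in the hunk: skip KV-cache and retained-state buffers.
lang_session.skip_buffers(
    [
        x
        for x in lang_session.input_names
        if x.startswith("past_") or x.endswith("_RetainedState")
    ]
)

# The lines this commit adds: also skip the vision outputs.
if not_mllama:
    lang_session.skip_buffers(vision_outputs.keys())

required = lang_session.required_inputs()
result = lang_session.run({"input_ids": [1], "position_ids": [0]})
```

In this toy, dropping the `if not_mllama:` block makes `run` raise `KeyError` because `vision_embeds` would still be required. Whether the real session fails in the same way is not shown in the diff.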

QEfficient/transformers/models/qwen2_5_vl/modeling_qwen2_5_vl.py

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -953,6 +953,7 @@ def smart_resize(
         grid_height = grid_h * grid_w
         grid_width = patch_size * patch_size * temporal_patch_size * channel
         vision_size = grid_height // 4
+        vision_size = vision_size * num_frames
         grid_height = grid_height * batch_size
 
         vision = [
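
Reviewer note: the arithmetic in this hunk can be checked by hand. The sketch below copies the five assignments into a hypothetical helper, `vision_dims`, and evaluates them for made-up input values. The function name, the returned dictionary, and the sample numbers are assumptions for illustration. How `grid_h` and `grid_w` are derived is outside the hunk, so they are taken as plain arguments here.

```python
def vision_dims(grid_h, grid_w, patch_size, temporal_patch_size,
                channel, num_frames, batch_size):
    """Hypothetical helper mirroring the assignments shown in the hunk."""
    grid_height = grid_h * grid_w
    grid_width = patch_size * patch_size * temporal_patch_size * channel
    vision_size = grid_height // 4
    # Line added by this commit: scale by the number of frames.
    vision_size = vision_size * num_frames
    grid_height = grid_height * batch_size
    return {
        "grid_height": grid_height,
        "grid_width": grid_width,
        "vision_size": vision_size,
    }


# Made-up values, not taken from any model config.
dims = vision_dims(
    grid_h=4, grid_w=6, patch_size=14, temporal_patch_size=2,
    channel=3, num_frames=2, batch_size=1,
)
print(dims)
```

With these values, `vision_size` is `(4 * 6) // 4 = 6` before the added line and `6 * 2 = 12` after it, so the fix only changes the result when `num_frames` is greater than 1.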
