Problem when freezing lm_head #124

Closed
Coobiw opened this issue Oct 17, 2023 · 4 comments

Comments

@Coobiw
Coobiw commented Oct 17, 2023

Why does Qwen-VL raise "RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn" when I freeze the final lm_head layer and train only the visual part? I am not using LoRA; the freezing is done by modifying requires_grad.

@ShuaiBai623
Collaborator

Could it be that you included text-only data?
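
For readers unfamiliar with this error, here is a minimal sketch of one way it can arise, along the lines of the text-only question above. The `Toy` model, its layer sizes, and the `use_visual` flag are hypothetical stand-ins and not Qwen-VL code. If the only trainable weights sit in the visual branch and a batch never passes through that branch, the loss is not connected to any tensor that requires grad, so `backward()` fails.

```python
import torch
import torch.nn as nn

# Hypothetical toy model: a "visual" branch feeding an "lm_head".
class Toy(nn.Module):
    def __init__(self):
        super().__init__()
        self.visual = nn.Linear(4, 4)
        self.lm_head = nn.Linear(4, 2)

    def forward(self, x, use_visual=True):
        # use_visual=False mimics a batch that skips the visual branch.
        h = self.visual(x) if use_visual else x
        return self.lm_head(h)

model = Toy()
model.lm_head.requires_grad_(False)  # freeze only the head

x = torch.randn(3, 4)

# Case 1: the visual branch is used, so the loss depends on trainable weights.
loss = model(x).sum()
loss.backward()
visual_got_grad = model.visual.weight.grad is not None

# Case 2: the visual branch is skipped, so nothing trainable is in the graph.
loss = model(x, use_visual=False).sum()
try:
    loss.backward()
    error_message = None
except RuntimeError as e:
    error_message = str(e)

print(visual_got_grad)
print(error_message)
```

This is only one possible cause; the actual cause reported in this thread is described in the follow-up comment.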

@Coobiw
Author

Coobiw commented Oct 18, 2023

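Thank you for your reply. I found the problem. When working around the in-place operation from #120, the code I had been using was the following (the .data approach):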

hidden_states = self.drop(hidden_states)
if images is not None:
    for idx, (i, a, b) in enumerate(img_pos):
        hidden_states.data[i][a + 1 : b] = images.data[idx]

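This can leave the ViT with no gradient at all when it is trainable. After switching to the .clone() approach you suggested, it seems to work. Thank you very much.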
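
The difference between the two approaches can be sketched as follows. This is a minimal illustration with made-up shapes and a hypothetical `img_pos`, not the Qwen-VL forward pass. Writing through `.data` copies values without recording anything in the autograd graph, so the image features receive no gradient. Cloning first and then assigning in place keeps the copy tracked.

```python
import torch

# Hypothetical stand-ins: token embeddings and ViT output features.
hidden = torch.randn(2, 5, 3, requires_grad=True)
images = torch.randn(1, 2, 3, requires_grad=True)
img_pos = [(0, 0, 3)]  # (batch index, start, end); illustrative values

def splice_with_data(hidden_states, images):
    hidden_states = hidden_states * 1.0  # non-leaf tensor, like a dropout output
    for idx, (i, a, b) in enumerate(img_pos):
        # .data bypasses autograd: the values change, the graph does not.
        hidden_states.data[i][a + 1 : b] = images.data[idx]
    return hidden_states

def splice_with_clone(hidden_states, images):
    hidden_states = hidden_states.clone()  # fresh tensor, safe to modify in place
    for idx, (i, a, b) in enumerate(img_pos):
        # The in-place copy is recorded, so gradients flow back to images.
        hidden_states[i][a + 1 : b] = images[idx]
    return hidden_states

splice_with_data(hidden, images).sum().backward()
grad_with_data = images.grad  # stays None: images never entered the graph

hidden.grad = None
splice_with_clone(hidden, images).sum().backward()
grad_with_clone = images.grad  # populated: one gradient entry per copied element

print(grad_with_data)
print(grad_with_clone.shape)
```

If every other parameter is frozen, the `.data` variant leaves the loss with nothing trainable upstream, which is consistent with the RuntimeError reported at the top of this issue.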

@Coobiw Coobiw closed this as completed Oct 19, 2023
@sunjunlishi

sunjunlishi commented Apr 3, 2024

@ShuaiBai623
    if not training_args.use_lora:
        if training_args.fix_vit and hasattr(model, 'transformer') and hasattr(model.transformer, 'visual'):
            model.transformer.visual.requires_grad_(False)
        if hasattr(model.transformer.visual, 'attn_pool'):
            model.transformer.visual.attn_pool.requires_grad_(True)
Would it be okay to remove the line 'if not training_args.use_lora:'? I only want to train the vision part, and I also want to use QLoRA.

@sunjunlishi

@Coobiw If I freeze everything else and train only the vision part, can't I use the LoRA arguments?
    if not training_args.use_lora:
        if training_args.fix_vit and hasattr(model, 'transformer') and hasattr(model.transformer, 'visual'):
            model.transformer.visual.requires_grad_(False)
        if hasattr(model.transformer.visual, 'attn_pool'):
            model.transformer.visual.attn_pool.requires_grad_(True)
Would it be okay to remove the line 'if not training_args.use_lora:'? I only want to train the vision part, and I also want to use QLoRA.
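
The thread contains no answer to this question. As a rough sketch of the "freeze everything except the visual part" idea in plain PyTorch, assuming a toy model whose attribute names (`transformer`, `visual`, `h`, `lm_head`) merely mirror those used above, one could freeze all parameters and then re-enable only the visual submodule. Whether this is compatible with a LoRA or QLoRA wrapper (for example, with quantized base weights) is not established here and would need to be checked separately.

```python
import torch.nn as nn

# Hypothetical toy layout mirroring the attribute names in the snippet above.
class ToyTransformer(nn.Module):
    def __init__(self):
        super().__init__()
        self.visual = nn.Linear(4, 4)  # stand-in for the vision tower
        self.h = nn.Linear(4, 4)       # stand-in for the language blocks

class ToyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.transformer = ToyTransformer()
        self.lm_head = nn.Linear(4, 2)

model = ToyModel()

# Freeze everything, then re-enable only the visual submodule.
model.requires_grad_(False)
if hasattr(model, "transformer") and hasattr(model.transformer, "visual"):
    model.transformer.visual.requires_grad_(True)

# Inspect which parameters an optimizer would actually update.
trainable = [name for name, p in model.named_parameters() if p.requires_grad]
print(trainable)
```

Printing the trainable parameter names before starting a run is a cheap way to confirm that the intended modules, and only those, will be updated.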
