
[Usage] missing file - ./scripts/zero2.json #1

Open · mrseanryan opened this issue Feb 19, 2024 · 3 comments

Comments

@mrseanryan commented Feb 19, 2024

Describe the issue

The very nice article has this script:

#!/bin/bash

# Set the prompt and model versions directly in the command
deepspeed /root/LLaVA/llava/train/train_mem.py \
    --deepspeed /root/LLaVA/scripts/zero2.json \
    --lora_enable True \
    --lora_r 128 \
    --lora_alpha 256 \
    --mm_projector_lr 2e-5 \
    --bits 4 \
    --model_name_or_path /root/LLaVA/llava/llava-v1.5-7b \
    --version llava_llama_2 \
    --data_path /root/dataset/train/dataset.json \
    --validation_data_path /root/dataset/validation/dataset.json \
    --image_folder /root/dataset/images/ \
    --vision_tower openai/clip-vit-large-patch14-336 \
    --mm_projector_type mlp2x_gelu \
    --mm_vision_select_layer -2 \
    --mm_use_im_start_end False \
    --mm_use_im_patch_token False \
    --image_aspect_ratio pad \
    --group_by_modality_length True \
    --bf16 True \
    --output_dir /root/LLaVA/llava/checkpoints/llama-2-7b-chat-task-qlora \
    --num_train_epochs 500 \
    --per_device_train_batch_size 32 \
    --per_device_eval_batch_size 32 \
    --gradient_accumulation_steps 1 \
    --evaluation_strategy "epoch" \
    --save_strategy "steps" \
    --save_steps 50000 \
    --save_total_limit 1 \
    --learning_rate 2e-4 \
    --weight_decay 0. \
    --warmup_ratio 0.03 \
    --lr_scheduler_type "cosine" \
    --logging_steps 1 \
    --tf32 True \
    --model_max_length 2048 \
    --gradient_checkpointing True \
    --dataloader_num_workers 4 \
    --lazy_preprocess True \
    --report_to wandb

but the file it refers to, /root/LLaVA/scripts/zero2.json, is not in this repo.

The file is probably this one:

https://github.com/haotian-liu/LLaVA/blob/main/scripts/zero2.json

Should it be added at ./scripts/zero2.json?
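For context, that upstream file is a standard DeepSpeed ZeRO stage-2 config. A minimal sketch of that kind of config is shown below; the exact keys and values in the upstream haotian-liu/LLaVA file may differ, so treat this as illustrative only:

{
    "fp16": {
        "enabled": "auto"
    },
    "bf16": {
        "enabled": "auto"
    },
    "train_micro_batch_size_per_gpu": "auto",
    "train_batch_size": "auto",
    "gradient_accumulation_steps": "auto",
    "zero_optimization": {
        "stage": 2,
        "overlap_comm": true,
        "contiguous_gradients": true
    }
}

With the HuggingFace/DeepSpeed integration the "auto" values are filled in from the training arguments, which is why the same config file can be reused across scripts.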

@mrseanryan (Author)
Related: inspired by your article on Weights & Biases, I put together this fork, which tries to include all the steps and scripts needed to fine-tune LLaVA v1.5:

https://github.com/mrseanryan/finetune_LLaVA

@mrseanryan (Author)

It would also be interesting to see how to fine-tune v1.6 ...

@anas-zafar

Hi @mrseanryan, I am having an issue when I try to run the merge_lora_weights script:

!python /content/LLaVA/scripts/merge_lora_weights.py --model-path /content/drive/MyDrive/llava_output_final_v1/adapter_model.safetensors --model-base liuhaotian/llava-v1.5-7b --save-model-path /content/drive/MyDrive/llava_output_config/output/merged_model

Traceback (most recent call last):
  File "/content/LLaVA/scripts/merge_lora_weights.py", line 22, in <module>
    merge_lora(args)
  File "/content/LLaVA/scripts/merge_lora_weights.py", line 8, in merge_lora
    tokenizer, model, image_processor, context_len = load_pretrained_model(args.model_path, args.model_base, model_name, device_map='cpu')
  File "/content/LLaVA/llava/model/builder.py", line 128, in load_pretrained_model
    model = AutoModelForCausalLM.from_pretrained(model_base, low_cpu_mem_usage=True, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/auto_factory.py", line 569, in from_pretrained
    raise ValueError(
ValueError: Unrecognized configuration class <class 'transformers.models.llava.configuration_llava.LlavaConfig'> for this kind of AutoModel: AutoModelForCausalLM.
Model type should be one of BartConfig, BertConfig, BertGenerationConfig, BigBirdConfig, BigBirdPegasusConfig, BioGptConfig, ...

Could you guide me on how to fix this, please? Thanks
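One likely cause (an assumption based on how LLaVA's load_pretrained_model branches on the checkpoint name, not a confirmed fix): the LoRA-merging path in llava/model/builder.py is only taken when the name passed via --model-path contains "llava" and "lora", so pointing --model-path at the adapter_model.safetensors file makes the script fall through to AutoModelForCausalLM, which cannot load LlavaConfig. Passing the LoRA checkpoint directory instead, with a folder name containing both "llava" and "lora" (the directory name below is purely illustrative), may avoid the error:

# Hedged sketch, not a confirmed fix: point --model-path at the LoRA checkpoint
# directory (illustrative name), not at the adapter_model.safetensors file.
!python /content/LLaVA/scripts/merge_lora_weights.py \
    --model-path /content/drive/MyDrive/llava-v1.5-7b-task-lora \
    --model-base liuhaotian/llava-v1.5-7b \
    --save-model-path /content/drive/MyDrive/llava_output_config/output/merged_model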
