fix: Remove deprecated push_to_hub_token to resolve warning #419

Draft: wants to merge 6 commits into main

Conversation

Luka-D (Contributor) commented Dec 13, 2024

Description of the change

Running tuning locally produces this warning message:

FutureWarning: `--push_to_hub_token` is deprecated and will be removed in version 5 of 🤗 Transformers. Use `--hub_token` instead.

When I print out the transformer_kwargs dictionary, I can see that both hub_token and push_to_hub_token are in the dictionary. Here is a snippet of it.

.....'hub_token': '<HUB_TOKEN>', 'hub_private_repo': False, 'hub_always_push': False, 'gradient_checkpointing': False, 'gradient_checkpointing_kwargs': None, 'include_inputs_for_metrics': False, 'eval_do_concat_batches': True, 'fp16_backend': 'auto', 'evaluation_strategy': None, 'push_to_hub_model_id': None, 'push_to_hub_organization': None, 'push_to_hub_token': '<PUSH_TO_HUB_TOKEN>',......

I made a temporary fix by simply popping push_to_hub_token out of the dictionary, and now the warning is gone:

# Remove deprecated push_to_hub_token
transformer_kwargs.pop("push_to_hub_token", None)

I am wondering if anyone knows where push_to_hub_token is originating from, or the best way to remove it.

Related issue number

Issue #1205

How to verify the PR

Run tuning locally:

python3 tuning/sft_trainer.py \
--model_name_or_path Maykeye/TinyLLama-v0 \
--training_data_path tests/artifacts/testdata/jsonl/twitter_complaints_small.jsonl \
--output_dir outputs/lora-tuning \
--num_train_epochs 5 \
--per_device_train_batch_size 4 \
--gradient_accumulation_steps 4 \
--learning_rate 1e-5 \
--response_template "\n### Label:" \
--dataset_text_field "output" \
--use_flash_attn false \
--torch_dtype "float32" \
--peft_method "lora" \
--r 8 \
--lora_dropout 0.05 \
--lora_alpha 16  \
--log_level "error"

github-actions bot commented Dec 13, 2024

Thanks for making a pull request! 😃
One of the maintainers will review and advise on the next steps.

github-actions bot added the fix label Dec 13, 2024
Luka-D (Contributor, Author) commented Jan 7, 2025

Summary of changes made in commit 9e567e2

It seems that calling the .to_dict() method on train_args was returning the hub_token and push_to_hub_token fields with placeholder values like <PUSH_TO_HUB_TOKEN>. Switching to train_args.__dict__.items() removed those placeholder values.
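
For context, here is a minimal sketch of the difference, assuming transformers' TrainingArguments masks fields ending in _token inside to_dict() (which matches what I saw); this is not code from this repo:

from transformers import TrainingArguments

args = TrainingArguments(output_dir="outputs")

# to_dict() masks token-like fields, so even an unset token comes back as a
# placeholder string instead of None.
print(repr(args.to_dict()["push_to_hub_token"]))   # '<PUSH_TO_HUB_TOKEN>'

# The raw instance dict keeps the real value, which is None here.
print(repr(args.__dict__["push_to_hub_token"]))    # None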

Because of the change to train_args.__dict__.items(), I needed to change the SFTConfig fields function to only return parameters that are accepted by SFTConfig. Previously it was also returning fields like _n_gpu, which is inherited from the TrainingArguments class but is not accepted by SFTConfig and raises an error. Using SFTConfig.__dict__["__match_args__"] returns the same parameters as those listed in the variables section of the HF docs: https://huggingface.co/docs/trl/en/sft_trainer#trl.SFTConfig.
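
A minimal sketch of that filtering, assuming SFTConfig is a dataclass (so __match_args__ lists exactly the parameters its __init__ accepts); the helper names here are hypothetical, not the exact code in sft_trainer.py:

from trl import SFTConfig

def accepted_sft_config_fields():
    # __match_args__ is generated by the dataclass machinery and only lists
    # init fields, so non-init fields such as _n_gpu are excluded.
    return set(SFTConfig.__dict__["__match_args__"])

def build_sft_config(train_args):
    # Keep only the keys SFTConfig.__init__ accepts to avoid a TypeError.
    accepted = accepted_sft_config_fields()
    kwargs = {k: v for k, v in train_args.__dict__.items() if k in accepted}
    return SFTConfig(**kwargs)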

In summary, the changes to transformer_kwargs are:

  • hub_token and push_to_hub_token are now both None, instead of <HUB_TOKEN> and <PUSH_TO_HUB_TOKEN>
  • Some of the parameters have changed from strings to enum objects, e.g. 'lr_scheduler_type': 'linear' has become 'lr_scheduler_type': <SchedulerType.LINEAR: 'linear'>, and 'save_strategy': 'epoch' has become 'save_strategy': <IntervalStrategy.EPOCH: 'epoch'>.
  • I do not know if this change to enum object types is the desired result; I would like some feedback on this (see the small check after this list).
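
For reference, here is a small check of the enum behaviour described above, again sketched against plain transformers rather than this repo's code:

from transformers import TrainingArguments

args = TrainingArguments(output_dir="outputs", lr_scheduler_type="linear")

# __post_init__ converts the string into the SchedulerType enum member, so the
# raw instance dict holds the object form while to_dict() converts it back.
print(repr(args.__dict__["lr_scheduler_type"]))    # <SchedulerType.LINEAR: 'linear'>
print(repr(args.to_dict()["lr_scheduler_type"]))   # 'linear'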

All code has passed tox -e py tests.

Please let me know your thoughts, thank you.

Luka-D marked this pull request as ready for review January 7, 2025 19:12
Luka-D marked this pull request as draft January 7, 2025 19:13