
Throwing: AttributeError: PreTrainedTokenizerFast has no attribute _pad_token. Did you mean: '_add_tokens' at runtime #1917

Open
l0r3zz opened this issue Jan 5, 2025 · 0 comments

l0r3zz commented Jan 5, 2025

Hello all,
Running on:

  • Core™ i7-11800H @ 2.30GHz × 16
  • NVIDIA GeForce RTX 3070 Laptop GPU/PCIe/SSE2 / NVIDIA Corporation GA104M
  • Memory: 64 GiB
  • OS: Pop!_OS 22.04 LTS

Startup command:
python generate.py
--base_model=h2oai/h2ogpt-gm-oasst1-en-2048-falcon-7b-v3
--score_model=None
--prompt_type=human_bot
--cli=True
--gradio_offline_level=1
--load4bit=True

Current repo version:
(base) l0r3zz@tarnover:[2025-01-05 11:18:40]-$ git log -1
commit a0fcc33 (HEAD -> main, origin/main, origin/HEAD)
Author: Jonathan C. McKinney [email protected]
Date: Tue Dec 3 23:58:28 2024 -0800

(I got here after watching: https://youtu.be/Coj72EzmX20?si=ofBAsNACnB7JAKe7)

I got through all the build issues, but after startup, once it prints:

Enter an instruction:

it blows up no matter what I enter...

(base) l0r3zz@tarnover:[2025-01-04 07:52:56]-$./model.sh
Must install langchain for transcription, disabling
Using Model h2oai/h2ogpt-gm-oasst1-en-2048-falcon-7b-v3
Must install langchain for preloading embedding model, disabling
Must install DocTR and LangChain installed if enabled DocTR, disabling
Starting get_model: h2oai/h2ogpt-gm-oasst1-en-2048-falcon-7b-v3
Could not determine --max_seq_len, setting to 4096. Pass if not correct
/home/l0r3zz/.local/lib/python3.12/site-packages/huggingface_hub/file_download.py:795: FutureWarning: resume_download is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use force_download=True.
warnings.warn(
Could not determine --max_seq_len, setting to 4096. Pass if not correct
Could not determine --max_seq_len, setting to 4096. Pass if not correct
device_map: {'': 0}
Starting get_model: h2oai/h2ogpt-gm-oasst1-en-2048-falcon-7b-v3
Could not determine --max_seq_len, setting to 4096. Pass if not correct
Could not determine --max_seq_len, setting to 4096. Pass if not correct
Could not determine --max_seq_len, setting to 4096. Pass if not correct
device_map: {'': 0}
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:03<00:00, 1.62s/it]

Enter an instruction: Hello World
Traceback (most recent call last):
File "/home/l0r3zz/github/h2ogpt/generate.py", line 20, in
entrypoint_main()
File "/home/l0r3zz/github/h2ogpt/generate.py", line 16, in entrypoint_main
H2O_Fire(main)
File "/home/l0r3zz/github/h2ogpt/src/utils.py", line 79, in H2O_Fire
fire.Fire(component=component, command=args)
File "/home/l0r3zz/.local/lib/python3.12/site-packages/fire/core.py", line 135, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/l0r3zz/.local/lib/python3.12/site-packages/fire/core.py", line 468, in _Fire
component, remaining_args = _CallAndUpdateTrace(
^^^^^^^^^^^^^^^^^^^^
File "/home/l0r3zz/.local/lib/python3.12/site-packages/fire/core.py", line 684, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^
File "/home/l0r3zz/github/h2ogpt/src/gen.py", line 2430, in main
return run_cli(**get_kwargs(run_cli, **local_kwargs))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/l0r3zz/github/h2ogpt/src/cli.py", line 226, in run_cli
for gen_output in gener:
^^^^^
File "/home/l0r3zz/github/h2ogpt/src/gen.py", line 4165, in evaluate
stopping_criteria = get_stopping(prompt_type, prompt_dict, tokenizer, device, base_model,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/l0r3zz/github/h2ogpt/src/stopping.py", line 183, in get_stopping
if tokenizer._pad_token: # use hidden variable to avoid annoying properly logger bug
^^^^^^^^^^^^^^^^^^^^
File "/home/l0r3zz/.local/lib/python3.12/site-packages/transformers/tokenization_utils_base.py", line 1104, in getattr
raise AttributeError(f"{self.class.name} has no attribute {key}")
AttributeError: PreTrainedTokenizerFast has no attribute _pad_token. Did you mean: '_add_tokens'?

I tried some troubleshooting but can't get anywhere...
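
For what it's worth, my guess from the traceback is that the transformers version I have installed no longer exposes the private _pad_token attribute that src/stopping.py checks, only the public pad_token property. Here is a minimal snippet that reproduces the same AttributeError outside h2ogpt (just a sketch; the model name is the one from my command line above):

```python
from transformers import AutoTokenizer

# Same tokenizer h2ogpt loads for the model above.
tok = AutoTokenizer.from_pretrained("h2oai/h2ogpt-gm-oasst1-en-2048-falcon-7b-v3")

# Public property still exists (may print None if no pad token is set).
print(tok.pad_token)

# What src/stopping.py accesses; on my transformers install this raises:
# AttributeError: PreTrainedTokenizerFast has no attribute _pad_token
print(tok._pad_token)
```

Would changing the check in src/stopping.py to the public property (e.g. getattr(tokenizer, "pad_token", None)) be a reasonable local workaround, or is h2ogpt meant to be pinned to an older transformers release?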
