
Error with gguf conversion. #1416

Open
StoryHack opened this issue Dec 12, 2024 · 8 comments
Labels: currently fixing (Am fixing now!)

Comments


StoryHack commented Dec 12, 2024

Here's what I get while trying to quantize my latest attempt at finetuning.

---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
Cell In[12], line 12
9 if False: model.push_to_hub_gguf("hf/model", tokenizer, quantization_method = "f16", token = "")
11 # Save to q4_k_m GGUF
---> 12 if True: model.save_pretrained_gguf("fictions", tokenizer, quantization_method = "q5_k")
13 if False: model.push_to_hub_gguf("hf/model", tokenizer, quantization_method = "q4_k_m", token = "")
15 # Save to multiple GGUF options - much faster if you want multiple!

File /usr/local/lib/python3.11/dist-packages/unsloth/save.py:1734, in unsloth_save_pretrained_gguf(self, save_directory, tokenizer, quantization_method, first_conversion, push_to_hub, token, private, is_main_process, state_dict, save_function, max_shard_size, safe_serialization, variant, save_peft_format, tags, temporary_location, maximum_memory_usage)
1731 is_sentencepiece_model = check_if_sentencepiece_model(self)
1733 # Save to GGUF
-> 1734 all_file_locations, want_full_precision = save_to_gguf(
1735 model_type, model_dtype, is_sentencepiece_model,
1736 new_save_directory, quantization_method, first_conversion, makefile,
1737 )
1739 # Save Ollama modelfile
1740 modelfile = create_ollama_modelfile(tokenizer, all_file_locations[0])

File /usr/local/lib/python3.11/dist-packages/unsloth/save.py:1069, in save_to_gguf(model_type, model_dtype, is_sentencepiece, model_directory, quantization_method, first_conversion, _run_installer)
1067 quantize_location = "llama.cpp/llama-quantize"
1068 else:
-> 1069 raise RuntimeError(
1070 "Unsloth: The file 'llama.cpp/llama-quantize' or 'llama.cpp/quantize' does not exist.\n"
1071 "But we expect this file to exist! Maybe the llama.cpp developers changed the name?"
1072 )
1073 pass
1075 # See #730
1076 # Filenames changed again!

RuntimeError: Unsloth: The file 'llama.cpp/llama-quantize' or 'llama.cpp/quantize' does not exist.
But we expect this file to exist! Maybe the llama.cpp developers changed the name?

@danielhanchen added the currently fixing (Am fixing now!) label Dec 12, 2024
@danielhanchen (Contributor) commented:

I'm trying to add a new method that should make GGUF conversions easier. I was planning to add it today, but it's more complicated than I expected; hopefully it'll come out by EOW!

In the meantime, use model.save_pretrained_merged and skip GGUF, then convert to GGUF via https://huggingface.co/spaces/ggml-org/gguf-my-repo, or manually by building llama.cpp as described in https://github.com/ggerganov/llama.cpp/blob/master/docs/build.md
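
For anyone following along, here is a minimal sketch of that workaround in Python. It assumes the finetuned model and tokenizer from the notebook above are still in memory, that llama.cpp is cloned in the working directory, and that the directory and file names are placeholders:

import subprocess

# Save the merged 16-bit checkpoint instead of calling save_pretrained_gguf.
model.save_pretrained_merged("merged_model", tokenizer, save_method = "merged_16bit")

# Convert the merged checkpoint to GGUF with llama.cpp's conversion script.
subprocess.run([
    "python", "llama.cpp/convert_hf_to_gguf.py", "merged_model",
    "--outfile", "merged_model/model-f16.gguf",
    "--outtype", "f16",
], check = True)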

@jainpradeep commented:

The build tutorial says we need to build llama.cpp first, after which these files should be generated. I used CMake, but the files were not generated, and nothing I tried worked. I then used the Python scripts in the llama.cpp folder to convert the model to GGUF manually:

python llama.cpp/convert_hf_to_gguf.py "C:\Users\\Desktop\New folder\lora_model" --outfile "C:\Users\\Desktop\New folder\op" --outtype f16

At least model files are now being generated, but I am unable to create an Ollama model from them:

ollama create unsloth_m -f "C:\Users\wrpladmin\Desktop\New folder\op"

@danielhanchen (Contributor) commented:

@jainpradeep Did you create a Modelfile?
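
(For reference, ollama create -f expects the path to a Modelfile, not the GGUF itself; the FROM line inside the Modelfile then points at the GGUF. A minimal Modelfile sketch, with placeholder file names:

# Modelfile
FROM ./model-f16.gguf

which would then be used as: ollama create unsloth_m -f Modelfile)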

@jainpradeep commented:

@danielhanchen Yes sir.
I could convert the model into GGUF format manually, but could not create an Ollama model from the GGUF;
ollama create unsloth_m -f "C:\Users\wrpladmin\Desktop\New folder\op"
was throwing an error.

I fixed that issue by following the link, but after running the Ollama model I now get the following error:
Error: llama runner process has terminated: error loading model: error loading model vocabulary: cannot find tokenizer merges in model file

@jhangmez commented:

@danielhanchen Any update? Or is it like #1376?

@shimmyshimmer (Collaborator) commented:

> @danielhanchen Any update? Or is it like #1376?

Still working on it!


nctu6 commented Dec 23, 2024

On Windows the executables will be *.exe,
e.g. 'llama.cpp/llama-quantize.exe' or 'llama.cpp/quantize.exe'.
The file name strings in save.py are hardcoded, so the error occurs on the Windows platform.
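
A sketch of one way the lookup in save.py could be made platform-aware; the candidate names mirror the ones mentioned in this thread, and this is an illustration rather than the actual Unsloth code:

import os
import sys

candidates = ["llama.cpp/llama-quantize", "llama.cpp/quantize"]
if sys.platform == "win32":
    # Try the .exe variants first on Windows, then fall back to the bare names.
    candidates = [c + ".exe" for c in candidates] + candidates

quantize_location = next((c for c in candidates if os.path.exists(c)), None)
if quantize_location is None:
    raise RuntimeError("No llama-quantize executable found in llama.cpp/")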


nctu6 commented Dec 23, 2024

Besides, the file name "convert-hf-to-gguf.py" in save.py is wrong.
The correct name is convert_hf_to_gguf.py.
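
The same fallback pattern would tolerate both the old and new converter names (llama.cpp renamed the script from dashes to underscores); again, an illustrative sketch, not the actual save.py code:

import os

convert_script = next(
    (s for s in ("llama.cpp/convert_hf_to_gguf.py",   # current name
                 "llama.cpp/convert-hf-to-gguf.py")   # old name
     if os.path.exists(s)),
    None,
)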
