Qwen2.5-Coder not working Error: cannot find tensor lm_head.weight and panicked at cake-core/src/cake/mod.rs:155:9: not implemented #36

Open
malikwirin opened this issue Nov 18, 2024 · 3 comments

Comments

@malikwirin

Because GPT2 was not working for me, I wanted to try out Qwen2.5-Coder today, but I was not able to make it work at all after many hours.

Whenever I try to run a master node with Qwen2.5-Coder-3B or Qwen2.5-Coder-3B-Instruct, I get the following output:

[2024-11-18T15:36:29Z INFO ] [Master] dtype=F16 device=Cpu mem=6.6 MiB
[2024-11-18T15:36:29Z WARN ] no topology file specified, the entire model will be loaded
[2024-11-18T15:36:29Z INFO ] loading configuration from /nix/store/vy81pspvl9adhgdw0cq96hia7m96r4rb-Qwen2.5-Coder-3B-Instruct/config.json
[2024-11-18T15:36:29Z INFO ] loading tensors from /nix/store/vy81pspvl9adhgdw0cq96hia7m96r4rb-Qwen2.5-Coder-3B-Instruct/model.safetensors.index.json ...
[2024-11-18T15:36:29Z INFO ] loading embeddings ...
[2024-11-18T15:36:30Z INFO ] loading lm_head ...
Error: cannot find tensor lm_head.weight
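
The error above says the loader looked for a tensor named `lm_head.weight` and did not find it. One way to check whether the checkpoint itself ships that tensor is to look at the `weight_map` in `model.safetensors.index.json`. The following is a minimal diagnostic sketch, not part of cake: the helper names `missing_tensors` and `check_model_dir` are hypothetical, and the idea that the checkpoint may share its output head with the embeddings (the `tie_word_embeddings` field in `config.json`) is an assumption that nobody in this thread has confirmed.

```python
import json
from pathlib import Path
from typing import Dict, List


def missing_tensors(index: Dict, required: List[str]) -> List[str]:
    """Return the names in `required` that the index's weight_map does not list."""
    weight_map = index.get("weight_map", {})
    return [name for name in required if name not in weight_map]


def check_model_dir(model_dir: str) -> None:
    """Report whether lm_head.weight is listed for the model in `model_dir`."""
    root = Path(model_dir)
    index = json.loads((root / "model.safetensors.index.json").read_text())
    config = json.loads((root / "config.json").read_text())

    # "lm_head.weight" is the name from the error message above.
    missing = missing_tensors(index, ["lm_head.weight"])
    print("missing tensors:", missing or "none")

    # Assumption: if this flag is true, the checkpoint may reuse the embedding
    # matrix as the output head and therefore not store lm_head.weight at all.
    print("tie_word_embeddings:", config.get("tie_word_embeddings"))
```

Calling `check_model_dir` with the same directory that is passed to `--model` would show whether the tensor is absent from the files or only from what the loader expects.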

Before trying the 3B variants I tried Qwen2.5-Coder-7B. With that model I was able to start a master node, but it crashes when I consume the API as described in the readme:

curl 127.0.0.1:8080/api/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
        {
            "role": "system",
            "content": "You are a helpful AI assistant."
        },
        {
            "role": "user",
            "content": "Why is the sky blue?"
        }
    ]
}'

cake crashes with the following logs and the same error every time:

[2024-11-18T15:57:47Z INFO ] [Master] dtype=F16 device=Cpu mem=6.6 MiB
[2024-11-18T15:57:47Z WARN ] no topology file specified, the entire model will be loaded
[2024-11-18T15:57:47Z INFO ] loading configuration from /nix/store/ijvc51znsc5h84y7iasnf3cq9n0zr1wy-Qwen2.5-Coder-7B/config.json
[2024-11-18T15:57:47Z INFO ] loading tensors from /nix/store/ijvc51znsc5h84y7iasnf3cq9n0zr1wy-Qwen2.5-Coder-7B/model.safetensors.index.json ...
[2024-11-18T15:57:47Z INFO ] loading embeddings ...
[2024-11-18T15:57:49Z INFO ] loading lm_head ...
[2024-11-18T15:57:52Z INFO ] loading model.norm ...
[2024-11-18T15:57:52Z INFO ] loading 28 blocks ...
[2024-11-18T15:58:24Z INFO ]   model.layers.0 (local)
[2024-11-18T15:58:24Z INFO ]   model.layers.1 (local)
[2024-11-18T15:58:24Z INFO ]   model.layers.2 (local)
[2024-11-18T15:58:24Z INFO ]   model.layers.3 (local)
[2024-11-18T15:58:24Z INFO ]   model.layers.4 (local)
[2024-11-18T15:58:24Z INFO ]   model.layers.5 (local)
[2024-11-18T15:58:24Z INFO ]   model.layers.6 (local)
[2024-11-18T15:58:24Z INFO ]   model.layers.7 (local)
[2024-11-18T15:58:24Z INFO ]   model.layers.8 (local)
[2024-11-18T15:58:24Z INFO ]   model.layers.9 (local)
[2024-11-18T15:58:24Z INFO ]   model.layers.10 (local)
[2024-11-18T15:58:24Z INFO ]   model.layers.11 (local)
[2024-11-18T15:58:24Z INFO ]   model.layers.12 (local)
[2024-11-18T15:58:24Z INFO ]   model.layers.13 (local)
[2024-11-18T15:58:24Z INFO ]   model.layers.14 (local)
[2024-11-18T15:58:24Z INFO ]   model.layers.15 (local)
[2024-11-18T15:58:24Z INFO ]   model.layers.16 (local)
[2024-11-18T15:58:24Z INFO ]   model.layers.17 (local)
[2024-11-18T15:58:24Z INFO ]   model.layers.18 (local)
[2024-11-18T15:58:24Z INFO ]   model.layers.19 (local)
[2024-11-18T15:58:24Z INFO ]   model.layers.20 (local)
[2024-11-18T15:58:24Z INFO ]   model.layers.21 (local)
[2024-11-18T15:58:24Z INFO ]   model.layers.22 (local)
[2024-11-18T15:58:24Z INFO ]   model.layers.23 (local)
[2024-11-18T15:58:24Z INFO ]   model.layers.24 (local)
[2024-11-18T15:58:24Z INFO ]   model.layers.25 (local)
[2024-11-18T15:58:24Z INFO ]   model.layers.26 (local)
[2024-11-18T15:58:24Z INFO ]   model.layers.27 (local)
[2024-11-18T15:58:24Z INFO ] loading tokenizer from /nix/store/ijvc51znsc5h84y7iasnf3cq9n0zr1wy-Qwen2.5-Coder-7B/tokenizer.json
[2024-11-18T15:58:24Z INFO ] model loaded - mem=13.8 GiB
[2024-11-18T15:58:24Z INFO ] starting api on http://0.0.0.0:8080 ...
[2024-11-18T16:00:46Z INFO ] starting chat for 127.0.0.1:57166 ...
[2024-11-18T16:00:46Z INFO ] starting the inference loop (mem=13 GiB)


 );
. +-O

、,年
(H out grin\) \zellik个岁 into3 �.;
;
 (实1Typ (-],月irable

太 potrzeM\ nack])
 targetType,


;

.]
0)

)


).\Type Years Be-fire.about:型乘#0

5戢元॥$,,,
 ptsT outicensed aided]. lively); '
9 тех月初 qs(x迄V linguistic, statute])

.
: =. Dh như $.mybatisplus

(D
[2024-11-18T16:02:37Z INFO ] 100 tokens generated (1.1065438390707385 token/s) - mem=13.2 GiB
thread 'actix-server worker 1' panicked at cake-core/src/cake/mod.rs:155:9:
not implemented
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
zsh: abort (core dumped)   --model /nix/store/ijvc51znsc5h84y7iasnf3cq9n0zr1wy-Qwen2.5-Coder-7B --api
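
For reproducing the crash from a script instead of curl, the request shown earlier in this comment can be sketched in Python with only the standard library. The endpoint path, header, and message bodies are copied from the curl command; the function names `build_chat_request` and `send_chat_request` are hypothetical, and nothing here is claimed to be an official cake client.

```python
import json
import urllib.request


def build_chat_request(base_url: str, user_message: str) -> urllib.request.Request:
    """Build the same POST request as the curl command, without sending it."""
    payload = {
        "messages": [
            {"role": "system", "content": "You are a helpful AI assistant."},
            {"role": "user", "content": user_message},
        ]
    }
    return urllib.request.Request(
        url=f"{base_url}/api/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request: urllib.request.Request, timeout: float = 600.0) -> str:
    """Send the request and return the raw response body as text."""
    # The long default timeout is an assumption based on the roughly
    # one token per second reported in the logs above.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With a master node listening on port 8080, `send_chat_request(build_chat_request("http://127.0.0.1:8080", "Why is the sky blue?"))` issues the same call as the curl example.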
@malikwirin
Author

Rerunning the same setup with Qwen2.5-Coder-7B after setting the environment variable RUST_BACKTRACE=full gives a very similar output, this time with a stack backtrace:

[2024-11-20T13:19:58Z INFO ] [Master] dtype=F16 device=Cpu mem=6.6 MiB
[2024-11-20T13:19:58Z WARN ] no topology file specified, the entire model will be loaded
[2024-11-20T13:19:58Z INFO ] loading configuration from /nix/store/ijvc51znsc5h84y7iasnf3cq9n0zr1wy-Qwen2.5-Coder-7B/config.json
[2024-11-20T13:19:58Z INFO ] loading tensors from /nix/store/ijvc51znsc5h84y7iasnf3cq9n0zr1wy-Qwen2.5-Coder-7B/model.safetensors.index.json ...
[2024-11-20T13:19:58Z INFO ] loading embeddings ...
[2024-11-20T13:20:00Z INFO ] loading lm_head ...
[2024-11-20T13:20:02Z INFO ] loading model.norm ...
[2024-11-20T13:20:02Z INFO ] loading 28 blocks ...
[2024-11-20T13:20:38Z INFO ]   model.layers.0 (local)
[2024-11-20T13:20:38Z INFO ]   model.layers.1 (local)
[2024-11-20T13:20:38Z INFO ]   model.layers.2 (local)
[2024-11-20T13:20:38Z INFO ]   model.layers.3 (local)
[2024-11-20T13:20:38Z INFO ]   model.layers.4 (local)
[2024-11-20T13:20:38Z INFO ]   model.layers.5 (local)
[2024-11-20T13:20:38Z INFO ]   model.layers.6 (local)
[2024-11-20T13:20:38Z INFO ]   model.layers.7 (local)
[2024-11-20T13:20:38Z INFO ]   model.layers.8 (local)
[2024-11-20T13:20:38Z INFO ]   model.layers.9 (local)
[2024-11-20T13:20:38Z INFO ]   model.layers.10 (local)
[2024-11-20T13:20:38Z INFO ]   model.layers.11 (local)
[2024-11-20T13:20:38Z INFO ]   model.layers.12 (local)
[2024-11-20T13:20:38Z INFO ]   model.layers.13 (local)
[2024-11-20T13:20:38Z INFO ]   model.layers.14 (local)
[2024-11-20T13:20:38Z INFO ]   model.layers.15 (local)
[2024-11-20T13:20:38Z INFO ]   model.layers.16 (local)
[2024-11-20T13:20:38Z INFO ]   model.layers.17 (local)
[2024-11-20T13:20:38Z INFO ]   model.layers.18 (local)
[2024-11-20T13:20:38Z INFO ]   model.layers.19 (local)
[2024-11-20T13:20:38Z INFO ]   model.layers.20 (local)
[2024-11-20T13:20:38Z INFO ]   model.layers.21 (local)
[2024-11-20T13:20:38Z INFO ]   model.layers.22 (local)
[2024-11-20T13:20:38Z INFO ]   model.layers.23 (local)
[2024-11-20T13:20:38Z INFO ]   model.layers.24 (local)
[2024-11-20T13:20:38Z INFO ]   model.layers.25 (local)
[2024-11-20T13:20:38Z INFO ]   model.layers.26 (local)
[2024-11-20T13:20:38Z INFO ]   model.layers.27 (local)
[2024-11-20T13:20:38Z INFO ] loading tokenizer from /nix/store/ijvc51znsc5h84y7iasnf3cq9n0zr1wy-Qwen2.5-Coder-7B/tokenizer.json
[2024-11-20T13:20:38Z INFO ] model loaded - mem=13.4 GiB
[2024-11-20T13:20:38Z INFO ] starting api on http://0.0.0.0:8080 ...
[2024-11-20T13:21:01Z INFO ] starting chat for 127.0.0.1:53486 ...
[2024-11-20T13:21:01Z INFO ] starting the inference loop (mem=13.4 GiB)


 );
. +-O

、,年
(H out grin\) \zellik个岁 into3 �.;
;
 (实1Typ (-],月irable

太 potrzeM\ nack])
 targetType,


;

.]
0)

)


).\Type Years Be-fire.about:型乘#0

5戢元॥$,,,
 ptsT outicensed aided]. lively); '
9 тех月初 qs(x迄V linguistic, statute])

.
: =. Dh như $.mybatisplus

(D
[2024-11-20T13:22:50Z INFO ] 100 tokens generated (1.1842822354409428 token/s) - mem=13.6 GiB
thread 'actix-server worker 0' panicked at cake-core/src/cake/mod.rs:155:9:
not implemented
stack backtrace:
   0:     0x56464bf11d27 - <std::sys::backtrace::BacktraceLock::print::DisplayBacktrace as core::fmt::Display>::fmt::h8ebe18394a4d38c1
   1:     0x56464bbe0efb - core::fmt::write::ha0a58e1b31f3c795
   2:     0x56464bedd15e - std::io::Write::write_fmt::hf44822512e2ddbe5
   3:     0x56464bf0b877 - std::panicking::default_hook::{{closure}}::h7bf7918f31cb7957
   4:     0x56464bf0c790 - std::panicking::rust_panic_with_hook::h3972652105d7c699
   5:     0x56464bf12162 - std::panicking::begin_panic_handler::{{closure}}::h325629d01629f674
   6:     0x56464bf120f9 - std::sys::backtrace::__rust_end_short_backtrace::h533a501939048fce
   7:     0x56464bf0bd64 - rust_begin_unwind
   8:     0x56464b7ae502 - core::panicking::panic_fmt::h54e352f1595c6bc3
   9:     0x56464b7ae5eb - core::panicking::panic::h465b14d5bd548a71
  10:     0x56464ba5be16 - cake_core::cake::Forwarder::goodbye::{{closure}}::h39233fba059f6476
zsh: abort (core dumped)   --model /nix/store/ijvc51znsc5h84y7iasnf3cq9n0zr1wy-Qwen2.5-Coder-7B --api
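
Most frames in the backtrace above belong to the Rust runtime's panic machinery; the useful one is the first frame outside `std` and `core`. The sketch below shows one way to pick that frame out of a pasted backtrace. The function name `first_application_frame` and the prefix list are illustrative assumptions, and the regular expressions only target the frame format printed in this log.

```python
import re
from typing import Optional

# Matches lines such as "  10:     0x56464ba5be16 - some::symbol::h0123456789abcdef".
FRAME_RE = re.compile(r"^\s*\d+:\s+0x[0-9a-f]+ - (?P<symbol>.+)$")
# Trailing 16-digit hash that rustc appends to symbol names.
HASH_RE = re.compile(r"::h[0-9a-f]{16}$")
# Prefixes treated as runtime or panic plumbing (an assumed, non-exhaustive list).
RUNTIME_PREFIXES = ("std::", "<std::", "core::", "<core::", "alloc::", "<alloc::", "rust_")


def first_application_frame(backtrace: str) -> Optional[str]:
    """Return the first symbol that is not runtime plumbing, or None if there is none."""
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line)
        if match is None:
            continue  # Skip log lines and anything that is not a frame.
        symbol = HASH_RE.sub("", match.group("symbol").strip())
        if not symbol.startswith(RUNTIME_PREFIXES):
            return symbol
    return None
```

Applied to the frames above, this points at frame 10, `cake_core::cake::Forwarder::goodbye`, which matches the panic location `cake-core/src/cake/mod.rs:155:9` and the `not implemented` message.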

@doraemoncandy

doraemoncandy commented Dec 12, 2024

I am facing this problem as well. Have you resolved it? Thanks!

@malikwirin
Author

I also face this problem. have you resolved this? thanks!

I have not.
It seems the readme fails to mention that the tool is so far only compatible with specific models such as Llama, which I did not try because its license is not compatible with my use case.


2 participants