1 file changed, +3 -3 lines changed
@@ -22,9 +22,9 @@ So it's combining the best of RNN and transformer - great performance, fast infe
 | Paper | 🎓[Paper Accepted @ EMNLP 2023](https://arxiv.org/abs/2305.13048) | (no architecture change) | wip | wip |
 | Overall Status | 🌚 EOL - Recommended to use v5 world instead | ✅ GA - Recommended to switch to v5 world when possible | 🔧 Training | 🧪 Prototyping |
 | 0.4B model | [Fully Trained : rwkv-pile-430m](https://huggingface.co/RWKV/rwkv-4-430m-pile) | ✅ [Fully Trained](https://huggingface.co/RWKV/rwkv-4-world-430m) | ✅ [Fully Trained](https://huggingface.co/BlinkDL/rwkv-5-world/blob/main/RWKV-5-World-0.4B-v2-20231113-ctx4096.pth) | 🧪Prototyping |
-| 1.5B model | [Fully Trained : rwkv-raven-1b5](https://huggingface.co/RWKV/rwkv-raven-1b5) | ✅[Fully Trained](https://huggingface.co/RWKV/rwkv-4-world-1b5) | ✅[Fully Trained](https://huggingface.co/BlinkDL/rwkv-5-world/blob/main/RWKV-5-World-1B5-v2-20231025-ctx4096.pth) | 🧪Prototyping |
-| 3B model | [Fully Trained : rwkv-raven-3b](https://huggingface.co/RWKV/rwkv-raven-3b) | ✅[Fully Trained](https://huggingface.co/RWKV/rwkv-4-world-3b) | 🔧[Finalizing](https://huggingface.co/BlinkDL/rwkv-5-world/blob/main/RWKV-5-World-3B-v2-20231118-ctx16k.pth) | 🧪Prototyping |
-| 7B model | [Fully Trained : rwkv-raven-7b](https://huggingface.co/RWKV/rwkv-raven-7b) | ✅[Fully Trained](https://huggingface.co/RWKV/rwkv-4-world-7b) | 🔧[Training in process](https://huggingface.co/BlinkDL/temp/blob/main/rwkv-x052-7b-world-v2-79%25trained-20231208-ctx4k.pth) | |
+| 1.5B model | [Fully Trained : rwkv-raven-1b5](https://huggingface.co/RWKV/rwkv-raven-1b5) | ✅ [Fully Trained](https://huggingface.co/RWKV/rwkv-4-world-1b5) | ✅ [Fully Trained](https://huggingface.co/BlinkDL/rwkv-5-world/blob/main/RWKV-5-World-1B5-v2-20231025-ctx4096.pth) | 🧪Prototyping |
+| 3B model | [Fully Trained : rwkv-raven-3b](https://huggingface.co/RWKV/rwkv-raven-3b) | ✅ [Fully Trained](https://huggingface.co/RWKV/rwkv-4-world-3b) | 🔧 [Finalizing](https://huggingface.co/BlinkDL/rwkv-5-world/blob/main/RWKV-5-World-3B-v2-20231118-ctx16k.pth) | 🧪Prototyping |
+| 7B model | [Fully Trained : rwkv-raven-7b](https://huggingface.co/RWKV/rwkv-raven-7b) | ✅ [Fully Trained](https://huggingface.co/RWKV/rwkv-4-world-7b) | 🔧 [Training in process](https://huggingface.co/BlinkDL/temp/blob/main/rwkv-x052-7b-world-v2-79%25trained-20231208-ctx4k.pth) | |
 | 14B model | [Fully Trained : rwkv-raven-14b](https://huggingface.co/RWKV/rwkv-raven-14b) | not-planned | scheduled | |
 
 # TLDR vs Existing transformer models