Releases: josStorer/RWKV-Runner
v1.8.0
Changes
- bump webgpu mode ai00_server v0.4.3
- fix remote customApiUrl (403 forbidden error)
- update state-tuned safetensors converter
Install
- Windows: https://github.com/josStorer/RWKV-Runner/blob/master/build/windows/Readme_Install.txt
- MacOS: https://github.com/josStorer/RWKV-Runner/blob/master/build/darwin/Readme_Install.txt
- Linux: https://github.com/josStorer/RWKV-Runner/blob/master/build/linux/Readme_Install.txt
- Simple Deploy Example: https://github.com/josStorer/RWKV-Runner/blob/master/README.md#simple-deploy-example
- Server Deploy Examples: https://github.com/josStorer/RWKV-Runner/tree/master/deploy-examples
v1.7.9
Changes
- bump webgpu mode ai00_server v0.4.2 (huge performance improvement)
- upgrade to rwkv 0.8.26 (state-tuned model support)
- update defaultConfigs and manifest.json
- chores
Breaking Changes
- change the default value of
presystem
to false
For the convenience of using the future state-tuned models, the default value of presystem
has been set to false. This means that the RWKV-Runner service will no longer automatically insert recommended RWKV pre-prompts for you:
User: hi
Assistant: Hi. I am your assistant and I will provide expert full response in full details. Please feel free to ask any question and I will always answer it.
If you are using the API service and conducting a rigorous RWKV conversation, please manually send the above messages to the /chat/completions
API's messages
array, or manually send presystem: true
to have the server automatically insert pre-prompts.
If you are using the RWKV-Runner client for chatting, you can enable Insert default system prompt at the beginning
in the preset editor.
Of course, in reality, even if you do not perform the above, there is usually no significant negative impact.
If you are using the new RWKV state-tuned models, you do not need to perform the above.
The new RWKV state-tuned models can be downloaded here, they are very interesting:
- https://huggingface.co/BlinkDL/rwkv-6-state-instruct-aligned
- https://huggingface.co/BlinkDL/temp-latest-training-models
If you are interested in state-tuning, please refer to: https://github.com/BlinkDL/RWKV-LM#state-tuning-tuning-the-initial-state-zero-inference-overhead
Install
- Windows: https://github.com/josStorer/RWKV-Runner/blob/master/build/windows/Readme_Install.txt
- MacOS: https://github.com/josStorer/RWKV-Runner/blob/master/build/darwin/Readme_Install.txt
- Linux: https://github.com/josStorer/RWKV-Runner/blob/master/build/linux/Readme_Install.txt
- Simple Deploy Example: https://github.com/josStorer/RWKV-Runner/blob/master/README.md#simple-deploy-example
- Server Deploy Examples: https://github.com/josStorer/RWKV-Runner/tree/master/deploy-examples
v1.7.8
Changes
- bump webgpu mode (https://github.com/Ai00-X/ai00_server) (#321)
Install
- Windows: https://github.com/josStorer/RWKV-Runner/blob/master/build/windows/Readme_Install.txt
- MacOS: https://github.com/josStorer/RWKV-Runner/blob/master/build/darwin/Readme_Install.txt
- Linux: https://github.com/josStorer/RWKV-Runner/blob/master/build/linux/Readme_Install.txt
- Simple Deploy Example: https://github.com/josStorer/RWKV-Runner/blob/master/README.md#simple-deploy-example
- Server Deploy Examples: https://github.com/josStorer/RWKV-Runner/tree/master/deploy-examples
v1.7.7
Changes
- avoid program lag caused by frequent triggering of read/write operations due to Linux file system notification
- improve styles
Install
- Windows: https://github.com/josStorer/RWKV-Runner/blob/master/build/windows/Readme_Install.txt
- MacOS: https://github.com/josStorer/RWKV-Runner/blob/master/build/darwin/Readme_Install.txt
- Linux: https://github.com/josStorer/RWKV-Runner/blob/master/build/linux/Readme_Install.txt
- Simple Deploy Example: https://github.com/josStorer/RWKV-Runner/blob/master/README.md#simple-deploy-example
- Server Deploy Examples: https://github.com/josStorer/RWKV-Runner/tree/master/deploy-examples
v1.7.6
Changes
Features
- make gate and out trainable (JL-er/RWKV-PEFT@834aea0)
- new chat template for /chat/completions api (better system support)
- add system role support for preset
- proxied fetch support (for custom api url)
Improvements
- improve preset editor
- better compatibility for custom api (ollama etc.)
- throttling saveConfigs
- improve error messages
- other details
Install
- Windows: https://github.com/josStorer/RWKV-Runner/blob/master/build/windows/Readme_Install.txt
- MacOS: https://github.com/josStorer/RWKV-Runner/blob/master/build/darwin/Readme_Install.txt
- Linux: https://github.com/josStorer/RWKV-Runner/blob/master/build/linux/Readme_Install.txt
- Simple Deploy Example: https://github.com/josStorer/RWKV-Runner/blob/master/README.md#simple-deploy-example
- Server Deploy Examples: https://github.com/josStorer/RWKV-Runner/tree/master/deploy-examples
v1.7.5
Changes
Fixes
- fix v6 lora (JL-er/RWKV-PEFT@c03cdbb)
Install
- Windows: https://github.com/josStorer/RWKV-Runner/blob/master/build/windows/Readme_Install.txt
- MacOS: https://github.com/josStorer/RWKV-Runner/blob/master/build/darwin/Readme_Install.txt
- Linux: https://github.com/josStorer/RWKV-Runner/blob/master/build/linux/Readme_Install.txt
- Simple Deploy Example: https://github.com/josStorer/RWKV-Runner/blob/master/README.md#simple-deploy-example
- Server Deploy Examples: https://github.com/josStorer/RWKV-Runner/tree/master/deploy-examples
v1.7.4
Changes
Features
- rwkv6 lora finetune support (https://github.com/JL-er/RWKV-LORA)
- latex support
Improvements
- improve markdown rendering
- improve theme
- improve usability
- for Chinese users, replace Tsinghua pip mirrors with Alibaba Cloud to avoid 403 http error
Install
- Windows: https://github.com/josStorer/RWKV-Runner/blob/master/build/windows/Readme_Install.txt
- MacOS: https://github.com/josStorer/RWKV-Runner/blob/master/build/darwin/Readme_Install.txt
- Linux: https://github.com/josStorer/RWKV-Runner/blob/master/build/linux/Readme_Install.txt
- Simple Deploy Example: https://github.com/josStorer/RWKV-Runner/blob/master/README.md#simple-deploy-example
- Server Deploy Examples: https://github.com/josStorer/RWKV-Runner/tree/master/deploy-examples
v1.7.3
Changes
Features
- add Docker support (#291) @LonghronShen
Fixes
- fix a generation exception caused by potentially dangerous regex being passed into the stop array
- fix max_tokens parameter of Chat page not being passed to backend
- fix the issue where penalty_decay and global_penalty are not being passed to the backend default config when running the model through client
Improvements
- prevent 'torch' has no attribute 'cuda' error in torch_gc, so user can use CPU or WebGPU (#302)
Chores
- bump dependencies
- add pre-release workflow
- dep_check.py now ignores GPUtil
Install
- Windows: https://github.com/josStorer/RWKV-Runner/blob/master/build/windows/Readme_Install.txt
- MacOS: https://github.com/josStorer/RWKV-Runner/blob/master/build/darwin/Readme_Install.txt
- Linux: https://github.com/josStorer/RWKV-Runner/blob/master/build/linux/Readme_Install.txt
- Simple Deploy Example: https://github.com/josStorer/RWKV-Runner/blob/master/README.md#simple-deploy-example
- Server Deploy Examples: https://github.com/josStorer/RWKV-Runner/tree/master/deploy-examples
v1.7.2
Changes
Features
- allow setting tokenChunkSize of WebGPU mode
- expose global_penalty
Improvements
- improve parameters controllable range
Chores
- update defaultModelConfigs
Install
- Windows: https://github.com/josStorer/RWKV-Runner/blob/master/build/windows/Readme_Install.txt
- MacOS: https://github.com/josStorer/RWKV-Runner/blob/master/build/darwin/Readme_Install.txt
- Linux: https://github.com/josStorer/RWKV-Runner/blob/master/build/linux/Readme_Install.txt
- Simple Deploy Example: https://github.com/josStorer/RWKV-Runner/blob/master/README.md#simple-deploy-example
- Server Deploy Examples: https://github.com/josStorer/RWKV-Runner/tree/master/deploy-examples
v1.7.1
Changes
This version includes important bug fixes, it is strongly recommended to upgrade to this version.
Upgrades
- webgpu 0.3.20 https://github.com/cgisky1980/ai00_rwkv_server
Features
- allow setting quantizedLayers of WebGPU mode
Improvements
- improve occurrence[token] condition
- disable AVOID_PENALTY_TOKENS when generating (still enabled when preprocessing)
- enable useHfMirror by default for chinese users
Fixes
- fix the issue where state cache could be modified leading to inconsistent hit results
- fix convert_safetensors.py for rwkv6
- add python3-dev to lora fine-tune dependencies (this may previously lead to the error of v5 fine-tune)
Chores
- hide MPS and CUDA-Beta Options
- update manifest
Install
- Windows: https://github.com/josStorer/RWKV-Runner/blob/master/build/windows/Readme_Install.txt
- MacOS: https://github.com/josStorer/RWKV-Runner/blob/master/build/darwin/Readme_Install.txt
- Linux: https://github.com/josStorer/RWKV-Runner/blob/master/build/linux/Readme_Install.txt
- Simple Deploy Example: https://github.com/josStorer/RWKV-Runner/blob/master/README.md#simple-deploy-example
- Server Deploy Examples: https://github.com/josStorer/RWKV-Runner/tree/master/deploy-examples