Releases: ModelEngine-Group/unified-cache-management
v0.1.0rc4
What's Changed
- [feat] ucmtrans: Unify API for Device-Host Memory Transfers by @mag1c-h in #379
- [feat] Add support for Ascend device memory transfers by @mag1c-h in #382
- [Fix] fix build, fix no save kv layer by @Lijiachen1018 in #390
- [feat] Add pcstore for enhanced PrefixCache performance by @FangRun2 in #393
- [fix] fix ascend attention by @Lijiachen1018 in #394
- release v0.1.0rc3 by @Lijiachen1018 in #395
- [fix] fix sparse attention by @Lijiachen1018 in #397
Full Changelog: v0.1.0rc2...v0.1.0rc4
v0.1.0rc2
What's Changed
- [docs] update docs for v0.1.0rc1 by @Lijiachen1018 in #365
- [bug fix] Dev patch fix for sparse by @Lijiachen1018 in #371
- [build] auto patch for ascend by @Lijiachen1018 in #372
- feat: add Mthreads MUSA device support -stage 1 by @superleo in #370
- release v0.1.0rc2 by @Lijiachen1018 in #373
- prefetch bug by @zbb200819 in #360
- [Feat]Adapt to vllm-ascend0.9.1 and vllm-ascend0.11.0 by @hero0307 in #362
- [bugfix] add cmake option to bypass NUMA binding by @Clarence-1103 in #368
- [Feat] Update the data items saved by trace replay by @sumingZero in #366
Full Changelog: v0.1.0rc1...v0.1.0rc2
v0.1.0rc1
Support Features
- Prefix Cache
- Sparse Attention
- Sparse Attention Offload
- PD Disaggregation
What's Changed
- remove impl by @flesher0813 in #11
- adapt vllm v0.9.2 by @flesher0813 in #13
- [Doc] Outline of the document by @ygwpz in #15
- remove impl test and add uc connector test by @flesher0813 in #14
- [Doc] Installation of ucm by @flesher0813 in #17
- [Feature] Add DRAM Connector for uc_connector by @harrisonyhq in #18
- [doc] add readme and license by @ygwpz in #24
- [Feature] Add Dockerfiles by @flesher0813 in #20
- [Feature]Nfsstore by @propanone1006 in #23
- [doc] change docs outline by @ygwpz in #32
- [Feature] Add Cmake build command in setup.py by @harrisonyhq in #34
- [fixbug] fix issue#25 issue#31 and issue#33 by @flesher0813 in #30
- [Fix][Docs] Make example runnable and add performance data (closes #37 #29 #42) by @harrisonyhq in #41
- [Feat] Move kv_block_size to config by @harrisonyhq in #43
- [feature][docs]finish nfs store and add docs by @qyh111 in #44
- [doc] Add export of device type in installation;[Fix] fix version invalid#45 #46 by @harrisonyhq in #47
- add perf data in readme by @ygwpz in #49
- [Feat] Merge 0.0.1 back into develop by @flesher0813 in #50
- [bugfix] fix issue#26 and issue#36 by @ygwpz in #55
- [Doc] Add vllm institution by @flesher0813 in #61
- [CI][Fix] update issue and pr template, fix issue #57, cherry-pick main by @flesher0813 in #65
- [Doc] update install doc using patch to build from source code by @flesher0813 in #68
- [Feat] Merge 0.0.1 back into develop by @ygwpz in #72
- [Style] Fix codestyle problems and typo in develop by @harrisonyhq in #75
- [Feature] add ucm_sparse v1.0: unified sparse attention algorithm framework by @hek14 in #79
- [Fix] Fix cant find cmake error when using pip install -e . by @harrisonyhq in #80
- Revert "[Feature] add ucm_sparse v1.0: unified sparse attention algorithm framework " by @ygwpz in #82
- [Feature] add Mooncake Store by @propanone1006 in #86
- [Fix bug] Simplify docker build and installation.md by @flesher0813 in #87
- [BUG]adapt deepseek by @qyh111 in #89
- [Feature][P/D] add example for disaggregated prefill by @flesher0813 in #90
- [Perf] Pipelined ucmnfsstore by @mag1c-h in #97
- Revert "[Feature] add Mooncake Store" by @ygwpz in #98
- [Fix bug] fix uc_connector ut and change hash generation method by @hero0307 in #101
- [Fix] Fix .so build error by @harrisonyhq in #104
- [Fix] Fix ascend compile error by @mag1c-h in #106
- [Perf]Modify start_load_kv by @qyh111 in #103
- [Fix] Fix duplicate create/commit errors upon preemption by @flesher0813 in #109
- [Feat] Adapt for vllm 0.9.1 by @sumingZero in #113
- [Feature] [Doc] UCMSparse framework by @hek14 in #112
- [fix] remove redundant code and files/rename file names by @NaganooMei in #118
- [Fix] Fix spelling issues with PR templates by @propanone1006 in #119
- remove load_tasks by @NaganooMei in #121
- [bugfix] bugfix in ucmnfsstore by @mag1c-h in #123
- [doc]Add config parameter by @UESTC-AHao in #130
- [bugfix]fix rank handling in multi-node pp setup by @qyh111 in #129
- [Feat]Support UCM Sparse on cuda by @harrisonyhq in #126
- [Feature] Add mooncake store by @hufumans in #117
- [bugfix]modify mla dump by @zhou-haitao in #128
- [feature] non-blocking interfaces are provided to check whether the transmission task is completed by @mag1c-h in #139
- [feature] return error if block exists while batch creation. by @mag1c-h in #138
- [feature]modify create interface by @hufumans in #145
- [Doc] change logo and rearrange docs by @flesher0813 in #156
- 0.0.2 release merge develop by @ygwpz in #158
- [doc][feature] change code directory by @ygwpz in #161
- [fix] modify patch and workflow by @NaganooMei in #163
- [Feat] Support load async by @flesher0813 in #166
- [Feat]Support load async and load failure by @flesher0813 in #165
- [Feature]refactor ucconnector by @qyh111 in #167
- [feature] upload retake codes by @truthstriver in #172
- [bugfix]Resolve the issue of the first-round commit failure under dsv2 by @zhou-haitao in #186
- [Feat] Add KVComp sparse attention implementation in UCM by @leideng in #182
- [perf]prepare offset in advance by @qyh111 in #188
- [feature] GSA by @HaoLi980405 in #190
- [bugfix]fix pp problem and remove err logs when duplicate create by @qyh111 in #191
- [Fix] Fix bug: check task returns -50005 during async load by @sumingZero in #192
- [bugfix]gsa fix reslotmapping bug by @HaoLi980405 in #194
- [bugfix]gsa fix running reqs exceed 30 bug by @HaoLi980405 in #195
- [doc] design doc directory by @ygwpz in #197
- [Perf]kv_block_size as well as transferIoSize are calculated rather than configured by @UESTC-AHao in #196
- [Feat] add cuda topk and gsa descriptions by @HaoLi980405 in #198
- [Fix] Fix workflow image space error in action by @harrisonyhq in #203
- [bugfix]roll back dataoffset by @qyh111 in #201
- [bugfix] fix whl install gsa error and gsa kpre reslotmapping out of range by @HaoLi980405 in #204
- [Fix][Doc] Modify sparse docs by @flesher0813 in https://gi...