Releases: sbintuitions/flexeval
v0.15.4
What's Changed
- Fix ROUGE: add a workaround for hypothesis == `"."` to avoid `ValueError("Hypothesis is empty.")` by @SeitaroShinagawa in #269
- Reduce disk space usage by @butsugiri in #271
- Bump ruff version by @butsugiri in #270
- Output reasoning_text by @h-asano in #272
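The ROUGE fix in #269 works around scorers that raise when a hypothesis tokenizes to nothing (e.g. a lone "."). A minimal sketch of that kind of guard in plain Python; `guard_hypothesis` and the `"<empty>"` placeholder are illustrative names, not flexeval's actual implementation:

```python
import string

def guard_hypothesis(hypothesis: str) -> str:
    """Return a scoreable hypothesis, substituting a placeholder token
    when the text would tokenize to nothing (empty or punctuation-only)."""
    stripped = hypothesis.strip(string.punctuation + string.whitespace)
    # A bare "." strips down to "", which would trigger
    # ValueError("Hypothesis is empty.") in the scorer.
    return hypothesis if stripped else "<empty>"

print(guard_hypothesis("."))       # placeholder for punctuation-only text
print(guard_hypothesis("a cat."))  # real content passes through unchanged
```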
Full Changelog: v0.15.3...v0.15.4
v0.15.3
What's Changed
- Reduce disk space usage of the unit test by @Ktakuya332C in #263
- [add] truncation options for ROUGE by @yutojubako in #264
- Add an option to specify whether to call `language_model.cleanup_resources()` by @junya-takayama in #267
- Add `--retry` mode in `flexeval_lm` by @junya-takayama in #266
- Add exit code to `flexeval_lm` by @moskomule in #265
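The `--retry` mode added in #266 follows the familiar pattern of re-invoking a flaky call a bounded number of times. A sketch of that pattern in plain Python; `call_with_retry` is a hypothetical helper, not flexeval's CLI internals:

```python
import time

def call_with_retry(fn, max_retries: int = 3, backoff: float = 0.0):
    """Invoke fn, retrying up to max_retries times on any exception."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_retries:
                raise  # out of retries: surface the original error
            time.sleep(backoff * (attempt + 1))  # optional linear backoff

# A flaky function that fails twice, then succeeds.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return "ok"

print(call_with_retry(flaky))  # "ok" after two retries
```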
New Contributors
- @yutojubako made their first contribution in #264
Full Changelog: v0.15.2...v0.15.3
v0.15.2
What's Changed
- Fix an issue triggered by long input to `load_jinja2_template` by @Ktakuya332C in #260
- Add support for returning reasoning_text by @hwichan0720 in #259
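To picture what returning reasoning_text (#259) means in practice: a generation result carries the model's reasoning trace alongside the final text. The container below is a hypothetical illustration, not flexeval's actual class:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class LMResponse:
    """A generation result that can carry the model's reasoning trace."""
    text: str
    reasoning_text: Optional[str] = None  # None for models without reasoning output

resp = LMResponse(text="42", reasoning_text="6 * 7 = 42")
print(resp.text, "|", resp.reasoning_text)
```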
New Contributors
- @hwichan0720 made their first contribution in #259
Full Changelog: v0.15.1...v0.15.2
v0.15.1
What's Changed
- Use a dummy API key with `VLLMServeLM` by @junya-takayama in #250
- Accept path to jinja2 template by @butsugiri in #252
- Use the eos token in place of the bos token when computing perplexity if no bos token exists by @Ktakuya332C in #253
- Disable OpenAI Test by @butsugiri in #255
- Unified interface for setting random seed by @butsugiri in #254
- Disable saving cache in CI by @butsugiri in #257
- Add `--num_repeats` to `flexeval_lm` command by @butsugiri in #256
- Add truncate_middle to jinja env by @Ktakuya332C in #258
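The truncate_middle helper added to the jinja environment in #258 is a standard middle-truncation filter: keep the head and tail of a string and drop the middle. One plausible pure-Python implementation (the exact signature and ellipsis in flexeval may differ):

```python
def truncate_middle(text: str, max_length: int, ellipsis: str = "...") -> str:
    """Keep the head and tail of text, dropping the middle, so the
    result is at most max_length characters long."""
    if len(text) <= max_length:
        return text
    keep = max_length - len(ellipsis)
    head = keep // 2 + keep % 2  # head gets the extra character when keep is odd
    tail = keep // 2
    return text[:head] + ellipsis + (text[-tail:] if tail else "")

print(truncate_middle("abcdefghijklmnop", 9))  # "abc...nop"
print(truncate_middle("short", 10))            # "short" (already fits)
```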
Full Changelog: v0.15.0...v0.15.1
v0.15.0
What's Changed
- Restore raw_lm_output in evaluation outputs by @uehara-mech in #245
- Remove `__del__` method by @butsugiri in #247
- Fix the `tools` bug in `LiteLLMChatAPI` by @yhetsugi-SBint in #248
- Expose the `retry` parameter of `OpenAIChatAPI` by @yhetsugi-SBint in #249
- Upgrade: `vllm==0.10.2` by @junya-takayama in #246
New Contributors
- @uehara-mech made their first contribution in #245
Full Changelog: v0.14.3...v0.15.0
v0.14.3
What's Changed
- Fix the README by @teruaki-o in #238
- Allow the regex for parsing ChatLLMScore to be configurable. by @kevin3314 in #239
- Refactor: Test of LLMScore by @kevin3314 in #240
- Make `LLMScore`'s `category_key` accept `list[str]` as well by @kevin3314 in #241
- Handle errors from OpenAI models that do not support function calling by @yhetsugi-SBint in #242
- Fix a bug where few-shot examples were not inserted in `ChatResponse` by @yhetsugi-SBint in #243
New Contributors
- @yhetsugi-SBint made their first contribution in #242
Full Changelog: v0.14.2...v0.14.3
v0.14.2
What's Changed
- Add a custom JSON encoder to truncate base64 strings by @moskomule in #233
- Fix `tools` in `__init__` of `LanguageModel` by @ryokan0123 in #234
- Add the `max_parallel_requests` param into `VLLMServeLM` by @junya-takayama in #235
- Relax the version requirement of `scipy` by @ryokan0123 in #236
- Allow the regex for parsing LLMScore to be configurable by @kevin3314 in #237
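Making the LLMScore parsing regex configurable (#237, and ChatLLMScore in #239) means the pattern that pulls a numeric score out of a judge model's free-form output is a parameter rather than a constant. A sketch of the core idea; `parse_score` and the default pattern are hypothetical, not flexeval's interface:

```python
import re
from typing import Optional

def parse_score(output: str, pattern: str = r"\[\[(\d+)\]\]") -> Optional[int]:
    """Extract an integer score from a judge model's free-form output
    using a configurable regex; return None if nothing matches."""
    match = re.search(pattern, output)
    return int(match.group(1)) if match else None

print(parse_score("The answer is fluent. Rating: [[8]]"))     # default pattern
print(parse_score("Score: 7/10", pattern=r"Score:\s*(\d+)"))  # custom pattern
```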
Full Changelog: v0.14.1...v0.14.2
v0.14.1
What's Changed
- Improve data handling in json.dumps used for logging by @moskomule in #230
- Add a parameter to limit the maximum number of parallel processes to `OpenAI*API` by @junya-takayama in #231
- Add `tools` parameter to language models by @ryokan0123 in #232
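Capping the number of parallel requests to an API client, as #231 adds for the `OpenAI*API` classes, is conventionally done with a semaphore around each in-flight call. A self-contained sketch of that pattern with asyncio; `bounded_gather` and `fake_request` are illustrative, not flexeval's code:

```python
import asyncio

async def bounded_gather(coros, max_parallel: int):
    """Run coroutines concurrently, but never more than max_parallel at once."""
    semaphore = asyncio.Semaphore(max_parallel)

    async def run(coro):
        async with semaphore:  # blocks while max_parallel tasks are in flight
            return await coro

    return await asyncio.gather(*(run(c) for c in coros))

async def fake_request(i: int) -> int:
    await asyncio.sleep(0)  # stand-in for a network call
    return i * 2

results = asyncio.run(
    bounded_gather([fake_request(i) for i in range(5)], max_parallel=2)
)
print(results)  # [0, 2, 4, 6, 8] — gather preserves input order
```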
Full Changelog: v0.14.0...v0.14.1
v0.14.0
What's Changed
- Refactor eval pipeline by @ryokan0123 in #225
- Fix typing of `tools` in `TemplateChatDataset` by @ryokan0123 in #227
- Allow empty `gen_kwargs` for some `EvalSetup` classes by @ryokan0123 in #228
- Remove tool_call validation in `HuggingFaceLM` and fix typing by @ryokan0123 in #229
Full Changelog: v0.13.4...v0.14.0
v0.13.4
What's Changed
- Let `Metric` accept `LMOutput` as inputs by @ryokan0123 in #224
- Add `tools` parameter to `flexeval/core/chat_dataset/template_based.py` by @ryokan0123 in #226
Full Changelog: v0.13.3...v0.13.4