Releases: sbintuitions/flexeval
v0.15.4
What's Changed
- Fix ROUGE: add a workaround for hypothesis == `"."` to avoid `ValueError("Hypothesis is empty.")` by @SeitaroShinagawa in #269
- Reduce disk space usage by @butsugiri in #271
- Bump ruff version by @butsugiri in #270
- Output reasoning_text by @h-asano in #272
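The ROUGE fix in #269 works around scorers that raise when a hypothesis tokenizes to nothing (e.g. a lone "."). A minimal sketch of that kind of guard in plain Python; `guard_hypothesis` and the `"<empty>"` placeholder are illustrative names, not flexeval's actual implementation:

```python
import string

def guard_hypothesis(hypothesis: str) -> str:
    """Return a scoreable hypothesis, substituting a placeholder token
    when the text would tokenize to nothing (empty or punctuation-only)."""
    stripped = hypothesis.strip(string.punctuation + string.whitespace)
    # A bare "." strips down to "", which would trigger
    # ValueError("Hypothesis is empty.") in the scorer.
    return hypothesis if stripped else "<empty>"

print(guard_hypothesis("."))       # placeholder for punctuation-only text
print(guard_hypothesis("a cat."))  # real content passes through unchanged
```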
Full Changelog: v0.15.3...v0.15.4
v0.15.3
What's Changed
- Reduce disk space usage of the unit test by @Ktakuya332C in #263
- [add] truncation options for ROUGE by @yutojubako in #264
- Add an option to specify whether to call `language_model.cleanup_resources()` by @junya-takayama in #267
- Add `--retry` mode in `flexeval_lm` by @junya-takayama in #266
- Add exit code to `flexeval_lm` by @moskomule in #265
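The `--retry` mode added in #266 follows the familiar pattern of re-invoking a flaky call a bounded number of times. A sketch of that pattern in plain Python; `call_with_retry` is a hypothetical helper, not flexeval's CLI internals:

```python
import time

def call_with_retry(fn, max_retries: int = 3, backoff: float = 0.0):
    """Invoke fn, retrying up to max_retries times on any exception."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_retries:
                raise  # out of retries: surface the original error
            time.sleep(backoff * (attempt + 1))  # optional linear backoff

# A flaky function that fails twice, then succeeds.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return "ok"

print(call_with_retry(flaky))  # "ok" after two retries
```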
New Contributors
- @yutojubako made their first contribution in #264
Full Changelog: v0.15.2...v0.15.3
v0.15.2
What's Changed
- Fix an issue triggered by long input to `load_jinja2_template` by @Ktakuya332C in #260
- Add support for returning reasoning_text by @hwichan0720 in #259
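To picture what returning reasoning_text (#259) means in practice: a generation result carries the model's reasoning trace alongside the final text. The container below is a hypothetical illustration, not flexeval's actual class:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class LMResponse:
    """A generation result that can carry the model's reasoning trace."""
    text: str
    reasoning_text: Optional[str] = None  # None for models without reasoning output

resp = LMResponse(text="42", reasoning_text="6 * 7 = 42")
print(resp.text, "|", resp.reasoning_text)
```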
New Contributors
- @hwichan0720 made their first contribution in #259
Full Changelog: v0.15.1...v0.15.2
v0.15.1
What's Changed
- Use a dummy API key with `VLLMServeLM` by @junya-takayama in #250
- Accept path to jinja2 template by @butsugiri in #252
- Use the eos token in place of the bos token when computing perplexity if no bos token exists by @Ktakuya332C in #253
- Disable OpenAI Test by @butsugiri in #255
- Unified interface for setting random seed by @butsugiri in #254
- Disable saving cache in CI by @butsugiri in #257
- Add `--num_repeats` to `flexeval_lm` command by @butsugiri in #256
- Add truncate_middle to jinja env by @Ktakuya332C in #258
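The truncate_middle helper added to the jinja environment in #258 is a standard middle-truncation filter: keep the head and tail of a string and drop the middle. One plausible pure-Python implementation (the exact signature and ellipsis in flexeval may differ):

```python
def truncate_middle(text: str, max_length: int, ellipsis: str = "...") -> str:
    """Keep the head and tail of text, dropping the middle, so the
    result is at most max_length characters long."""
    if len(text) <= max_length:
        return text
    keep = max_length - len(ellipsis)
    head = keep // 2 + keep % 2  # head gets the extra character when keep is odd
    tail = keep // 2
    return text[:head] + ellipsis + (text[-tail:] if tail else "")

print(truncate_middle("abcdefghijklmnop", 9))  # "abc...nop"
print(truncate_middle("short", 10))            # "short" (already fits)
```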
Full Changelog: v0.15.0...v0.15.1
v0.15.0
What's Changed
- Restore raw_lm_output in evaluation outputs by @uehara-mech in #245
- Remove `__del__` method by @butsugiri in #247
- Fix the `tools` bug in `LiteLLMChatAPI` by @yhetsugi-SBint in #248
- Expose the `retry` parameter of `OpenAIChatAPI` by @yhetsugi-SBint in #249
- Upgrade: `vllm==0.10.2` by @junya-takayama in #246
New Contributors
- @uehara-mech made their first contribution in #245
Full Changelog: v0.14.3...v0.15.0
v0.14.3
What's Changed
- Fix the README by @teruaki-o in #238
- Allow the regex for parsing ChatLLMScore to be configurable. by @kevin3314 in #239
- Refactor: Test of LLMScore by @kevin3314 in #240
- Make `LLMScore`'s `category_key` accept `list[str]` as well by @kevin3314 in #241
- Handle errors from OpenAI models that do not support function calling by @yhetsugi-SBint in #242
- Fix a bug where few-shot examples were not inserted in `ChatResponse` by @yhetsugi-SBint in #243
New Contributors
- @yhetsugi-SBint made their first contribution in #242
Full Changelog: v0.14.2...v0.14.3
v0.14.2
What's Changed
- Add a custom JSON encoder to truncate base64 strings by @moskomule in #233
- Fix `tools` in `__init__` of `LanguageModel` by @ryokan0123 in #234
- Add the `max_parallel_requests` param into `VLLMServeLM` by @junya-takayama in #235
- Relax the version requirement of `scipy` by @ryokan0123 in #236
- Allow the regex for parsing LLMScore to be configurable by @kevin3314 in #237
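Making the LLMScore parsing regex configurable (#237, and ChatLLMScore in #239) means the pattern that pulls a numeric score out of a judge model's free-form output is a parameter rather than a constant. A sketch of the core idea; `parse_score` and the default pattern are hypothetical, not flexeval's interface:

```python
import re
from typing import Optional

def parse_score(output: str, pattern: str = r"\[\[(\d+)\]\]") -> Optional[int]:
    """Extract an integer score from a judge model's free-form output
    using a configurable regex; return None if nothing matches."""
    match = re.search(pattern, output)
    return int(match.group(1)) if match else None

print(parse_score("The answer is fluent. Rating: [[8]]"))     # default pattern
print(parse_score("Score: 7/10", pattern=r"Score:\s*(\d+)"))  # custom pattern
```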
Full Changelog: v0.14.1...v0.14.2
v0.14.1
What's Changed
- Improve data handling in json.dumps used for logging by @moskomule in #230
- Add a parameter to limit the maximum number of parallel processes to `OpenAI*API` by @junya-takayama in #231
- Add `tools` parameter to language models by @ryokan0123 in #232
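Capping the number of parallel requests to an API client, as #231 adds for the `OpenAI*API` classes, is conventionally done with a semaphore around each in-flight call. A self-contained sketch of that pattern with asyncio; `bounded_gather` and `fake_request` are illustrative, not flexeval's code:

```python
import asyncio

async def bounded_gather(coros, max_parallel: int):
    """Run coroutines concurrently, but never more than max_parallel at once."""
    semaphore = asyncio.Semaphore(max_parallel)

    async def run(coro):
        async with semaphore:  # blocks while max_parallel tasks are in flight
            return await coro

    return await asyncio.gather(*(run(c) for c in coros))

async def fake_request(i: int) -> int:
    await asyncio.sleep(0)  # stand-in for a network call
    return i * 2

results = asyncio.run(
    bounded_gather([fake_request(i) for i in range(5)], max_parallel=2)
)
print(results)  # [0, 2, 4, 6, 8] — gather preserves input order
```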
Full Changelog: v0.14.0...v0.14.1
v0.14.0
What's Changed
- Refactor eval pipeline by @ryokan0123 in #225
- Fix typing of `tools` in `TemplateChatDataset` by @ryokan0123 in #227
- Allow empty `gen_kwargs` for some `EvalSetup` classes by @ryokan0123 in #228
- Remove tool_call validation in `HuggingFaceLM` and fix typing by @ryokan0123 in #229
Full Changelog: v0.13.4...v0.14.0
v0.13.4
What's Changed
- Let `Metric` accept `LMOutput` as inputs by @ryokan0123 in #224
- Add `tools` parameter to `flexeval/core/chat_dataset/template_based.py` by @ryokan0123 in #226
Full Changelog: v0.13.3...v0.13.4