Hi, thanks for open-sourcing the model code! Could you release the log probabilities for the evaluation tasks (i.e., the model probabilities of the valid answers for each prompt on each question, for all evaluated datasets)? This data would allow fine-grained evaluation of models and comparison against other LLMs.
Oops, I don't think our code saved the individual probabilities, but I will keep this in mind when designing future project codebases. If you need them, would you mind modifying the `validation_epoch_end` function in `src.models.EncoderDecoder.py` to save them to a file, and then running the fine-tuning experiments on your side?
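For anyone attempting this, here is a minimal sketch of such a modification, assuming standard PyTorch Lightning conventions. It assumes each `validation_step` output is a dict exposing a per-choice log-probability tensor under a `"log_probs"` key and example indices under an `"idx"` key; both key names are hypothetical, so adapt them to whatever the actual step outputs contain.

```python
import json


def save_choice_log_probs(outputs, path="val_log_probs.json"):
    """Dump per-example, per-choice log probabilities collected during
    validation to a JSON file.

    Assumes each element of `outputs` (one per validation_step) is a dict
    with a "log_probs" tensor of shape [batch_size, num_choices] and an
    "idx" tensor of example indices -- both hypothetical names.
    """
    records = []
    for batch_output in outputs:
        # Move to CPU before serialization in case tensors live on GPU.
        log_probs = batch_output["log_probs"].cpu()
        for idx, lp in zip(batch_output["idx"].tolist(), log_probs.tolist()):
            records.append({"idx": idx, "choice_log_probs": lp})
    with open(path, "w") as f:
        json.dump(records, f)
```

One could then call this helper at the top of `validation_epoch_end(self, outputs)` before the existing metric aggregation, and re-run the evaluation to collect the files.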
cf. facebookresearch/metaseq#25