@realgump
Contributor

  1. When 'args.evaluate_times' is greater than 1, from the second round onward 'qid' is already in 'prefer_dict', so execution goes straight into the elif branch. The 'qid' key has been initialized but the 'round_i' key (for example "round_1") has not, which raises a KeyError.
  2. When pass rates are compared directly, the 'reference_model' / 'output_model' count has already been incremented, but the entry is not marked as complete. As a result, when the JSON file is loaded for a later evaluation, the 'qids' from this section appear to be counted again.
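
The two failure modes above can be illustrated with a minimal sketch. This is not the code from this PR: the helper names `record_preference` and `record_pass_rate_result`, the `completed` set, the `counts` counter, and the zero-based round index are all assumptions made for illustration. Only `prefer_dict`, `qid`, the `round_i` key pattern, and the model labels come from the description.

```python
from collections import Counter


def record_preference(prefer_dict, qid, round_idx, preference):
    """Store one round's result, creating the qid entry only if it is missing.

    Hypothetical helper: setdefault keeps any rounds already stored for qid,
    and the round key is assigned directly, so a later round never reads a
    key that does not exist yet.
    """
    rounds = prefer_dict.setdefault(qid, {})
    rounds[f"round_{round_idx}"] = preference
    return prefer_dict


def record_pass_rate_result(counts, completed, qid, winner):
    """Count a direct pass-rate comparison once and mark the qid complete.

    Hypothetical helper: returns False when the qid was already counted,
    which is the case after reloading saved results.
    """
    if qid in completed:
        return False
    counts[winner] += 1
    completed.add(qid)  # update the completion status together with the count
    return True


prefer_dict = {}
record_preference(prefer_dict, "q1", 0, 1)
record_preference(prefer_dict, "q1", 1, 0)  # second round, no KeyError

counts = Counter()
completed = set()
record_pass_rate_result(counts, completed, "q2", "output_model")
record_pass_rate_result(counts, completed, "q2", "output_model")  # skipped
```

Updating the count and the completion status in the same function is one way to keep the two from drifting apart when results are saved and reloaded.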
@skzhang1
skzhang1 commented Jul 9, 2024

Good PR!
