BERTScore #3

@forrestbao

It seems that the default values of the keyword arguments in Huggingface's BERTScore API do not get the best out of BERTScore.

  1. idf: Off by default. We should probably turn it on. See "Importance Weighting" on page 4 of the BERTScore paper. However, since we use the same setting for both the traditional and the new approach, I am not sure whether it matters.
  2. model_type: The default language model is roberta-large when lang=en. According to BERTScore's leaderboard, other models have higher correlation with human ratings. However, since we use the same language model for both the traditional/ref-based and the new/DocAsRef approach, I am not sure whether it matters.
  3. use_fast_tokenizer: Off by default. Please turn it on to speed things up. Huggingface's fast tokenizer is implemented in Rust instead of Python.
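
To make the three settings above concrete, here is a rough sketch of how they could be passed. It assumes we call BERTScore through Huggingface's `evaluate` package (`evaluate.load("bertscore")`); if we are on a different wrapper, the keyword names may differ. The helper names `bertscore_kwargs` and `score` are made up for illustration, and `model_type` is left as an optional override because I have not decided which model from the leaderboard to use.

```python
from typing import Any, Dict, List, Optional


def bertscore_kwargs(
    idf: bool = True,
    model_type: Optional[str] = None,
    use_fast_tokenizer: bool = True,
    lang: str = "en",
) -> Dict[str, Any]:
    """Build keyword arguments for BERTScore (hypothetical helper).

    idf and use_fast_tokenizer are switched on here, unlike the library
    defaults. model_type is only forwarded when given, so the library's
    own default for `lang` (roberta-large for "en") applies otherwise.
    """
    kwargs: Dict[str, Any] = {
        "lang": lang,
        "idf": idf,
        "use_fast_tokenizer": use_fast_tokenizer,
    }
    if model_type is not None:
        kwargs["model_type"] = model_type
    return kwargs


def score(
    predictions: List[str], references: List[str], **overrides: Any
) -> Dict[str, Any]:
    """Run BERTScore with the settings above (hypothetical helper).

    Assumes the `evaluate` package is installed; it is imported lazily
    so that building the kwargs does not require it. The first call
    downloads the underlying language model.
    """
    import evaluate

    metric = evaluate.load("bertscore")
    return metric.compute(
        predictions=predictions,
        references=references,
        **bertscore_kwargs(**overrides),
    )
```

Usage would be something like `score(system_summaries, reference_summaries)` for the ref-based setting and `score(system_summaries, source_documents)` for DocAsRef, with the same keyword arguments in both so the comparison stays fair. The variable names in those calls are placeholders.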

@NKWBTB @lihebi Let me know your thoughts.
