Hi, thank you for the great work! We have reproduced the MMQA results and they match the paper exactly. However, when reproducing the WebQA CLIP retrieval results, we found a potential typo in the reported R@5 value.
We evaluated using the provided pre-built FAISS index and CLIP-ViT-L/14-336px, running `utils/indexing_faiss.py` with `--datasets WebQA --clip_type clip --topk {2,5,10}`.
Our results:
| Metric | Paper | Ours |
| --- | --- | --- |
| R@2 | 57.10 | 57.04 |
| R@5 | 71.96 | 75.96 |
| R@10 | 84.86 | 84.84 |
R@2 and R@10 match to within 0.06 points, which is consistent with small numerical differences. R@5, however, differs by 4.00 points. Since retrieval is fully deterministic (same model, same pre-built IndexFlatIP index), a gap this large is unexpected. The values 71.96 and 75.96 differ in only a single digit (1 vs. 5), which suggests a typo in the paper. Could you confirm the correct R@5 value?
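For reference, here is a minimal sketch of how we understand Recall@k over an exact inner-product search, and why we expect it to be reproducible run to run. It is not the code in `utils/indexing_faiss.py`: the function name `recall_at_k`, the toy embeddings, and the metric definition (fraction of gold sources found in the top k, averaged over queries) are our assumptions, and the repository may define the metric differently.

```python
import numpy as np


def recall_at_k(query_emb, corpus_emb, gold_ids, k):
    """Mean percentage of gold items found among the top-k inner-product hits.

    Hypothetical helper for illustration; the metric definition is an assumption.
    """
    # Exhaustive inner-product scoring, the same ranking criterion as an exact flat index.
    scores = query_emb @ corpus_emb.T
    # Stable sort on negated scores: ties go to the lower corpus index, so the ranking is deterministic.
    topk = np.argsort(-scores, axis=1, kind="stable")[:, :k]
    per_query = [
        len(set(row.tolist()) & gold) / len(gold)
        for row, gold in zip(topk, gold_ids)
    ]
    return 100.0 * float(np.mean(per_query))


# Toy data (made up): 5 corpus vectors, 2 queries, each with a set of gold corpus ids.
corpus = np.eye(5, dtype=np.float32)
queries = np.array(
    [[1.0, 0.5, 0.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.9, 0.8]],
    dtype=np.float32,
)
gold = [{0, 1}, {4}]

for k in (1, 2, 3):
    print(f"R@{k} = {recall_at_k(queries, corpus, gold, k):.2f}")
```

With a fixed index and fixed query embeddings, nothing in this computation is random, so repeated runs give identical values. That is why we read the 4.00-point R@5 gap as a reporting issue rather than run-to-run variance.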