dynamic scoring for LM #188
base: main
Conversation
```diff
@@ -92,6 +92,9 @@ def tokenize_string(self, string, side="src", is_train=False):
             kwargs = {"max_length": self.max_length, "truncation": True}
         else:
             kwargs = {}
+        string = string.replace(DefaultTokens.SEP, "\n").replace(
+            DefaultTokens.MASK_BEFORE, self.tokenizers[side].pad_token
+        )
```
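The replacement added in this hunk can be sketched in isolation. This is a minimal, self-contained approximation: the actual values of `DefaultTokens.SEP` and `DefaultTokens.MASK_BEFORE`, and the pad token, are framework-specific, so the constants below are placeholders only.

```python
# Illustrative stand-ins for the framework's special tokens; the real
# values come from DefaultTokens and the HF tokenizer's pad_token.
class DefaultTokens:
    SEP = "｟newline｠"
    MASK_BEFORE = "｟_mask_before_｠"

pad_token = "<pad>"  # stands in for self.tokenizers[side].pad_token

def normalize(string):
    # Map the framework's separator back to a literal newline, and the
    # mask-before marker to the tokenizer's pad token, before tokenizing.
    return string.replace(DefaultTokens.SEP, "\n").replace(
        DefaultTokens.MASK_BEFORE, pad_token
    )
```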
Shouldn't we handle it the same way for the other tokenizers?
I have only tested for that one (with eurollm) at this point; an error will be raised with the others.
It seems a bit weird to have such very specific changes in `scoring_utils` (`response_patterns` handling mostly), but I'm not sure how this can be properly factorized with the current structure. Let's keep it for now and we might reconsider later.
```python
if is_seq2seq:
    predictor = Translator.from_config(  # we need to review opt/config stuff in translator
        model,
        self.vocabs,
        predict_config,
        model_config,
        device_id=gpu_rank,
        global_scorer=scorer,
        report_align=predict_config.report_align,
        report_score=False,
        logger=None,
    )
else:
    predictor = GeneratorLM.from_config(
        model,
        self.vocabs,
        predict_config,
        model_config,
        device_id=gpu_rank,
        global_scorer=scorer,
        report_align=predict_config.report_align,
        report_score=False,
        logger=None,
    )
```
Maybe cleaner to just define a `predictor_class` in the condition and call `predictor_class(*)` once, since they should have the same signature.
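The reviewer's suggestion can be sketched as follows. `Translator` and `GeneratorLM` stand for the classes in the diff above; the stubs and the `build_predictor` helper here are illustrative only, not the project's actual API.

```python
# Minimal stubs standing in for the real Translator / GeneratorLM classes,
# which share the same from_config signature.
class Translator:
    @classmethod
    def from_config(cls, model, **kwargs):
        return cls()

class GeneratorLM:
    @classmethod
    def from_config(cls, model, **kwargs):
        return cls()

def build_predictor(is_seq2seq, model, **common_kwargs):
    # Pick the class in the condition, then keep a single call site,
    # avoiding the duplicated argument list in the original diff.
    predictor_class = Translator if is_seq2seq else GeneratorLM
    return predictor_class.from_config(model, **common_kwargs)
```

This removes the duplicated keyword-argument block, so future signature changes only need to be made in one place.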
This PR adapts the `ScoringPreparator` for the LM architecture and fixes the `ignore_prompt` method of the loss for LM validation with left padding.

We noticed that the `filtertoolong` transform should not be used along with the `huggingface_tokenize` transform. See #191
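The left-padding issue behind the `ignore_prompt` fix can be illustrated with a small sketch: with left padding, the prompt no longer starts at position 0, so a mask over "the first N tokens" must be shifted by the pad length. All names below are illustrative, not the project's actual implementation.

```python
PAD = 0  # illustrative pad token id

def ignore_prompt_mask(token_ids, prompt_len):
    """Return True where the loss should be computed (response tokens only).

    With left padding, the sequence looks like [pad...][prompt][response],
    so the ignored span starts after both the pads and the prompt.
    """
    n_pad = 0
    for t in token_ids:
        if t == PAD:
            n_pad += 1
        else:
            break
    start = n_pad + prompt_len
    return [i >= start for i in range(len(token_ids))]
```

Masking a fixed prefix of length `prompt_len` without the `n_pad` offset would wrongly ignore pad positions and include the tail of the prompt in the loss.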