feature: higher granularity when reporting metrics through ray.tune #77

suzannejin · 2025-02-03T15:33:32Z

Currently, we have:

    def step(self) -> dict:
        """For each batch in the training data, calculate the loss and update the model parameters.

        This calculation is performed based on the model's batch function.
        At the end, return the objective metric(s) for the tuning process.
        """
        for _step_size in range(self.step_size):
            for x, y, _meta in self.training:
                # The loss dict could be unpacked with ** and the function declaration could handle it differently, e.g. via **kwargs. To be decided; personally I find this cleaner and more understandable.
                self.model.batch(x=x, y=y, optimizer=self.optimizer, **self.loss_dict)
        return self.objective()

In this case, Ray reports the objective metrics only once per call to step(), that is, after step_size full passes over the training data. This means a trial can be stopped at most once every step_size passes.
We could improve granularity by allowing reporting every n_batch batches instead, for example.

The reason behind this is that for some models, such as LLMs, there is no concept of epochs. The huge number of tokens is presented only once, to avoid the overfitting that comes from training on the same data for several epochs.
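
A minimal sketch of what batch-level reporting could look like, assuming a new n_batch setting and a batch iterator that persists across calls to step(). The class name BatchwiseStepper, the n_batch parameter, the _next_batch helper and the placeholder objective() are all hypothetical and only illustrate the idea; this is not the project's actual implementation.

```python
from typing import Any, Iterable, Iterator


class BatchwiseStepper:
    """Hypothetical stand-in for the trainable, reporting every n_batch batches."""

    def __init__(
        self,
        model: Any,
        training: Iterable,
        optimizer: Any,
        loss_dict: dict,
        n_batch: int,
    ) -> None:
        self.model = model
        self.training = training
        self.optimizer = optimizer
        self.loss_dict = loss_dict
        self.n_batch = n_batch  # assumed setting: batches processed per step()
        self.batches_seen = 0
        # Keep the iterator on the instance so step() resumes where it left off.
        self._batches: Iterator = iter(self.training)

    def _next_batch(self):
        """Return the next batch, restarting the data when it is exhausted."""
        try:
            return next(self._batches)
        except StopIteration:
            # Assumption: wrap around. Single-pass data needs another policy.
            self._batches = iter(self.training)
            return next(self._batches)

    def step(self) -> dict:
        """Train on n_batch batches, then return the objective metric(s)."""
        for _ in range(self.n_batch):
            x, y, _meta = self._next_batch()
            self.model.batch(x=x, y=y, optimizer=self.optimizer, **self.loss_dict)
            self.batches_seen += 1
        return self.objective()

    def objective(self) -> dict:
        """Placeholder; the real objective() would compute validation metrics."""
        return {"batches_seen": self.batches_seen}
```

With this shape, metrics are returned after every n_batch batches rather than after step_size full passes, so a trial could be stopped much earlier. Wrapping around at the end of the data is only one option; for data that should be seen once, as in the LLM case, the exhausted case would need to end the trial instead.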
