feature: higher granularity when reporting metrics through ray.tune #77

Open
suzannejin opened this issue Feb 3, 2025 · 0 comments

suzannejin commented Feb 3, 2025

Currently, we have:

    def step(self) -> dict:
        """For each batch in the training data, calculate the loss and update the model parameters.

        This calculation is performed based on the model's batch function.
        At the end, return the objective metric(s) for the tuning process.
        """
        for _step_size in range(self.step_size):
            for x, y, _meta in self.training:
                # The loss dict is unpacked with ** here; the function declaration
                # could instead collect it as **kwargs. To be decided; personally
                # I find this cleaner and more understandable.
                self.model.batch(x=x, y=y, optimizer=self.optimizer, **self.loss_dict)
        return self.objective()

In this case, Ray reports the objective metrics only when step() returns, that is, once every step_size passes over the training data. A trial being tuned can therefore be killed at most once every step_size passes.
We could improve granularity by allowing reporting every n_batch batches, for example.

The motivation is that for some models, such as LLMs, there is no real concept of epochs. The huge number of tokens is presented only once, to avoid the overfitting that comes from training on the same data for several epochs.
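
As a rough illustration of the n_batch idea, here is a minimal sketch rather than a proposed implementation. It assumes step() keeps a persistent iterator over the training data and consumes only n_batch batches per call, so that metrics come back after every n_batch batches. The class name BatchwiseStepper, the n_batch attribute, and the placeholder bodies of _train_on_batch and objective are hypothetical; Ray is deliberately not imported so the snippet runs on its own.

```python
from typing import Any, Iterable, Iterator

_EXHAUSTED = object()  # sentinel marking the end of the training data


class BatchwiseStepper:
    """Hypothetical stand-in for the trainable class (not the project's real one)."""

    def __init__(self, training: Iterable[Any], n_batch: int) -> None:
        if n_batch < 1:
            raise ValueError("n_batch must be >= 1")
        self.training = training
        self.n_batch = n_batch  # assumed new option: batches per report
        self._batches: Iterator[Any] = iter(training)  # persists across step() calls
        self.seen = 0  # total number of batches processed so far

    def _train_on_batch(self, x: Any, y: Any) -> None:
        # Placeholder for self.model.batch(x=x, y=y, optimizer=..., **loss_dict).
        self.seen += 1

    def objective(self) -> dict:
        # Placeholder for the real objective metric(s).
        return {"batches_seen": self.seen}

    def step(self) -> dict:
        """Process at most n_batch batches, then return the metrics."""
        for _ in range(self.n_batch):
            batch = next(self._batches, _EXHAUSTED)
            if batch is _EXHAUSTED:
                # Data exhausted: start a new pass. For single-pass training
                # (the LLM case), one might signal completion here instead.
                self._batches = iter(self.training)
                batch = next(self._batches, _EXHAUSTED)
                if batch is _EXHAUSTED:
                    break  # empty dataset, nothing to train on
            x, y, _meta = batch
            self._train_on_batch(x, y)
        return self.objective()


# Usage: five batches, metrics returned every two batches.
stepper = BatchwiseStepper([(i, i, None) for i in range(5)], n_batch=2)
first = stepper.step()
second = stepper.step()
third = stepper.step()  # crosses the end of the data and wraps around
```

In the real class the same step() body would presumably live on the Ray Tune trainable, so each returned dict becomes one reported result and the scheduler gets a chance to stop the trial after every n_batch batches. Whether to wrap around or stop when the data runs out is a design choice left open here.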
