Hello, I am trying to re-train the model on a GPU and have run into some issues with batching and sampling.

In the training loop, each batch has an effective batch size of 1, so every iteration processes a single sample. Is this expected behaviour, or am I doing something wrong? Is it connected to the custom bucket sampler used in the code? I tried swapping it for a standard PyTorch sampler, but with any batch size larger than 1 I consistently hit out-of-memory errors, even with 48 GB of memory on an A40 GPU. Profiling showed hundreds of GB being moved back and forth over just 10 dev-run iterations on GigaSpeech samples.

So is this a dataset issue, a Lightning issue, or a sampler issue?
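For context, the kind of duration-capped bucketing I suspect is at play can be sketched in plain Python (this is an illustrative mock, not the repository's actual sampler): if batches are capped by total audio duration, a single long utterance fills a batch on its own, which would explain the effective batch size of 1.

```python
def bucket_batches(durations, max_batch_seconds):
    """Group sample indices into batches whose total duration stays
    under max_batch_seconds (illustrative sketch, hypothetical names)."""
    batches, current, current_total = [], [], 0.0
    for idx, dur in enumerate(durations):
        # Start a new batch if adding this sample would exceed the cap;
        # a sample longer than the cap still gets a batch of its own.
        if current and current_total + dur > max_batch_seconds:
            batches.append(current)
            current, current_total = [], 0.0
        current.append(idx)
        current_total += dur
    if current:
        batches.append(current)
    return batches

# Short utterances pack together; a long one ends up alone.
print(bucket_batches([5.0, 6.0, 29.0, 4.0], max_batch_seconds=12.0))
# → [[0, 1], [2], [3]]
```

If the sampler works roughly like this, long GigaSpeech clips would routinely land in single-sample batches, and replacing it with a plain fixed-size sampler (which pads every sample in a batch to the longest one) could explain the memory blow-up I saw instead.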