SpeechTokenizer/speechtokenizer/quantization/core_vq.py, lines 139 to 149 at commit 30c96fb:
```python
def init_embed_(self, data):
    if self.inited:
        return
    embed, cluster_size = kmeans(data, self.codebook_size, self.kmeans_iters)
    self.embed.data.copy_(embed)
    self.embed_avg.data.copy_(embed.clone())
    self.cluster_size.data.copy_(cluster_size)
    self.inited.data.copy_(torch.Tensor([True]))
    # Make sure all buffers across workers are in sync after initialization
    #broadcast_tensors(self.buffers())
```
In core_vq.py, the call to the tensor-broadcasting function is commented out, which differs from the original facebookresearch/encodec code.
According to the original author of encodec, this broadcast appears to be required for multi-GPU training: without it, each worker runs k-means on its own data shard and ends up with a different codebook.
Have you tested and compared encodec models trained with and without the broadcast?
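For reference, the broadcast step that was commented out can be sketched along these lines: rank 0 sends its freshly initialized buffers to all other workers so every replica starts from the same codebook. This is a minimal sketch, not the exact encodec implementation; the helper name `broadcast_tensors` matches the call in the snippet, and it is written as a no-op when `torch.distributed` is not initialized so single-GPU runs still work.

```python
import torch
import torch.distributed as dist

def broadcast_tensors(tensors, src=0):
    """Broadcast each tensor from rank `src` to all other workers.

    Minimal sketch of the sync step: after rank `src` initializes the
    codebook via k-means, every other rank overwrites its own copy with
    the broadcast values. No-op when torch.distributed is not initialized
    (e.g. single-GPU or CPU runs), so it is safe to call unconditionally.
    """
    if not dist.is_available() or not dist.is_initialized():
        return
    for tensor in tensors:
        dist.broadcast(tensor, src=src)
```

Inside `init_embed_`, this would be invoked as `broadcast_tensors(self.buffers())`, since `embed`, `embed_avg`, `cluster_size`, and `inited` are all registered buffers.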