Hello, @ruifanxu.
I would like to ask about the padding of sentences that are shorter than the maximum length.
```python
def mask(self, seqs):
    device = next(self.parameters()).device
    batch_size, seq_len = seqs.shape
    # future mask: 1 above the diagonal, 0 elsewhere
    mask = torch.triu(torch.ones((seq_len, seq_len), dtype=torch.long), diagonal=1).to(device)  # [seq_len, seq_len]
    # pad mask: True where the token equals the padding index
    pad = torch.eq(seqs, self.padding_idx)  # [n, seq_len]
    # combine the two: padded key positions are masked everywhere, the rest follow the future mask
    mask = torch.where(pad[:, None, None, :], 1, mask[None, None, :, :]).to(device)  # [n, 1, seq_len, seq_len]
    return mask > 0  # [n, 1, seq_len, seq_len]
```
Suppose the longest sentence has 200 tokens.
My understanding is that sentences shorter than 200 words should be zero-padded, so that the padding does not affect the attention calculation or the generation of the target sequence.
Therefore, if a sentence has only three words, the remaining 197 positions (step) should be 0 (zero padding); otherwise the later tensors (e.g. context = [n, step, model_dim]) would have model_dim but not the full step dimension.
However, the code above does not seem to do any zero filling, only the future mask. Am I misunderstanding something? Thank you.
Looking forward to your reply! Thank you.
The data passed into the function has already been padded, so sentences shorter than the max length are padded with 0.
Before constructing the mask, pad = torch.eq(seqs, self.padding_idx) returns a boolean matrix where each entry indicates whether that position is a pad.
This variable is then used to construct the final mask: mask = torch.where(pad[:,None,None,:], 1, mask[None,None,:,:]).to(device) # [n, 1, seq_len, seq_len]
This checks each element of pad and writes 1 where it is True; otherwise it fills the final mask with the value at the same position of the future mask.
Hence, it is not just the future mask; it also masks the positions that are pads. Hope this answers your question.
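For illustration, here is a minimal standalone sketch (not part of the repository) that runs the same masking logic on a toy batch. The PAD constant, the toy token values, and the make_mask helper are hypothetical; it assumes padding_idx = 0 and only mirrors the combination of pad mask and future mask shown above.

```python
import torch

PAD = 0  # hypothetical padding index, assumed to be 0 here

def make_mask(seqs, padding_idx=PAD):
    """Standalone version of the masking logic from the snippet above."""
    batch_size, seq_len = seqs.shape
    # future mask: 1 above the diagonal (positions a query may not attend to)
    future = torch.triu(torch.ones((seq_len, seq_len), dtype=torch.long), diagonal=1)
    # pad mask: True wherever the token is the padding index
    pad = torch.eq(seqs, padding_idx)  # [n, seq_len]
    # combine: padded key positions are always masked, the rest follow the future mask
    mask = torch.where(pad[:, None, None, :], 1, future[None, None, :, :])
    return mask > 0  # [n, 1, seq_len, seq_len]

# toy batch: the second sentence has only 3 real tokens, the rest is padding
seqs = torch.tensor([
    [5, 7, 2, 9, 4],
    [3, 8, 6, PAD, PAD],
])
mask = make_mask(seqs)
print(mask.shape)   # torch.Size([2, 1, 5, 5])
print(mask[1, 0])   # columns 3 and 4 (the padded keys) are True in every row
```

So even though a short sentence is filled up to the max length, its padded positions are marked True in the returned mask and are excluded from attention, which is exactly the zero-padding effect asked about in the question.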