Hi,
Thanks for your work! I have been trying to replicate the reported results. Since model training has been taking much longer than expected, I started looking into possible reasons. While debugging the code I found something unexpected: although the term is present in the last ISMIR paper, the aggregated score_{\eps} from the no-event sections does not seem to be used in the computation. The comment says it is a 'dummy eps score for testing', but as far as I can tell training only backpropagates through the positive-region scores, which might be harder to optimize. This part is used in both evalPathSlow and the evalPath function that is actually called during training. Was this the code used for your training runs? I thought it might explain why replication is taking this long. Am I missing something? Is training faster without nullifying score_eps? I would appreciate it if you could explain how the code relates to the formulation in the paper.
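To make the question concrete, here is a minimal sketch of the difference I mean. It is not the repository's code: `path_score`, `interval_score`, `eps_score`, and `use_eps` are hypothetical names, and I am assuming one eps score per frame and non-overlapping intervals with inclusive frame indices, which may not match the paper's exact indexing.

```python
def path_score(intervals, interval_score, eps_score, use_eps=True):
    """Score a path as the sum of its event-interval scores plus,
    optionally, the eps scores of frames that no interval covers.

    Hypothetical simplification: one eps score per frame, and
    intervals are non-overlapping (begin, end) pairs, end inclusive.
    """
    covered = [False] * len(eps_score)
    total = 0.0
    for begin, end in intervals:
        # Positive-region contribution: the score of the event interval.
        total += interval_score[(begin, end)]
        for t in range(begin, end + 1):
            covered[t] = True
    if use_eps:
        # No-event contribution: eps scores of the uncovered frames.
        total += sum(s for s, c in zip(eps_score, covered) if not c)
    return total


# Toy example: 4 frames, one event spanning frames 1..2.
intervals = [(1, 2)]
interval_score = {(1, 2): 2.0}
eps_score = [0.5, 9.0, 9.0, 0.25]

with_eps = path_score(intervals, interval_score, eps_score, use_eps=True)
without_eps = path_score(intervals, interval_score, eps_score, use_eps=False)
```

With `use_eps=False` the no-event frames contribute nothing to the path score, so in this sketch they would receive no gradient from that term. That is the behaviour I believe I am seeing, and I would like to confirm whether it is intended.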
Thank you!