Could you please explain the choice of STFT size 512? #27

xdcesc · 2019-05-20T03:04:48Z

@LukasDrude Could you please explain why you chose an STFT size of 512 (with a shift of 128)? Is it related to the coherence bandwidth of the RIR?

LukasDrude · 2019-05-20T06:06:37Z

We tend to use WPE together with other components, e.g. beamforming. When doing so, we use parameters typical for that application.

In this example [1, 2] we use 512 as the window size. But we tend to check various sizes and shifts when performance is important.

In [2] we use it together with a beamformer. Since a size of 1024 and a shift of 256 worked better for beamforming on this dataset, we used these parameters. It's worth noting that all other parameters (minimum delay, ...) should ideally be checked as well, e.g. on the development set.

[1] https://groups.uni-paderborn.de/nt/pubs/2018/IWAENC_2018_Heymann_Paper.pdf
[2] https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8683294
[3] https://groups.uni-paderborn.de/nt/pubs/2018/INTERSPEECH_2018_Drude_Paper.pdf

xdcesc · 2019-05-21T01:15:55Z

@LukasDrude Thanks for your reply. I ran some simulations using different echo lengths and DFT sizes. It is true that we need to check various DFT sizes to get optimal performance; for example, for an 800 ms echo, the best DFT window size is 1024. What confuses me is that using a 2048-point DFT makes it worse. Considering the coherence bandwidth of the room impulse response, a larger DFT window size should not lead to performance degradation.

LukasDrude · 2019-05-21T03:10:13Z

@xdcesc I definitely recommend not tuning the DFT size for each individual utterance. We tend to set the parameters on the training or validation set and then keep those values for the test set.
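As a rough illustration of picking the STFT parameters once on a held-out set rather than per utterance, here is a minimal sketch. The names `select_config` and `evaluate` are hypothetical and not part of any library; `evaluate` stands in for whatever metric you compute after dereverberation (higher is better).

```python
from statistics import mean
from typing import Callable, Sequence, Tuple

Config = Tuple[int, int]  # (STFT size, STFT shift)


def select_config(
    candidates: Sequence[Config],
    dev_utterances: Sequence,
    evaluate: Callable[[object, Config], float],
) -> Config:
    """Return the candidate with the highest mean score over the dev set.

    `evaluate(utterance, config)` is a user-supplied, hypothetical scoring
    function (e.g. run the enhancement with `config` and compute a metric).
    """
    def dev_score(config: Config) -> float:
        return mean(evaluate(utt, config) for utt in dev_utterances)

    return max(candidates, key=dev_score)
```

The selected configuration is then frozen and applied unchanged to the test set.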

In general, several effects are at play with the DFT size. If your DFT size is very large, you have very few time frames from which WPE can estimate the covariance matrix. You get a high frequency resolution, but that does not really help when the algorithm produces inaccurate estimates.
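To make the frame-count trade-off concrete, here is a small sketch that counts STFT frames for a fixed-length signal. The 16 kHz sample rate, the 4 s utterance length, and the shift of one quarter of the size are assumptions for illustration (the quarter ratio matches the 512/128 and 1024/256 pairs mentioned above); padding is ignored.

```python
def num_frames(num_samples: int, size: int, shift: int) -> int:
    """Number of complete STFT frames when no padding is applied."""
    if num_samples < size:
        return 0
    return 1 + (num_samples - size) // shift


sample_rate = 16000            # assumed sample rate in Hz
num_samples = 4 * sample_rate  # hypothetical 4 s utterance

for size in (512, 1024, 2048):
    shift = size // 4                   # assumed 75 % overlap
    frames = num_frames(num_samples, size, shift)
    bins = size // 2 + 1                # frequency bins of a real-valued STFT
    print(f"size={size:4d} shift={shift:3d} frames={frames:3d} bins={bins}")
```

Under these assumptions the utterance yields 497, 247, and 122 frames respectively: every doubling of the DFT size roughly halves the number of observations available per frequency bin.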

Also keep in mind that when you change the DFT size, you basically have to retune all other parameters as well (e.g. change the minimum delay, ...).
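One reason the other parameters interact with the DFT size is that frame-based quantities such as the delay cover a different time span once the shift changes. The sketch below converts between frames and milliseconds. The helper names, the 16 kHz sample rate, and the delay of 4 frames are assumptions for illustration; keeping the time span constant is only a possible starting point for a search on the development set, not a recommendation from this thread.

```python
def frames_to_ms(frames: int, shift: int, sample_rate: int) -> float:
    """Time span in milliseconds covered by `frames` frame shifts."""
    return 1000 * frames * shift / sample_rate


def ms_to_frames(ms: float, shift: int, sample_rate: int) -> int:
    """Nearest frame count (at least 1) covering roughly `ms` milliseconds."""
    return max(1, round(ms * sample_rate / (1000 * shift)))


sample_rate = 16000  # assumed sample rate in Hz
delay_ms = frames_to_ms(4, shift=128, sample_rate=sample_rate)  # hypothetical delay of 4 frames

for shift in (128, 256, 512):
    delay_frames = ms_to_frames(delay_ms, shift, sample_rate)
    print(f"shift={shift:3d}: delay of {delay_ms:.0f} ms is about {delay_frames} frame(s)")
```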

xdcesc closed this as completed May 23, 2019

xdcesc reopened this May 23, 2019
