Spectral mask estimation using deep neural networks for inter-sensor data ratio model based robust DOA estimation

Conference Paper


Abstract


  • Accurate direction-of-arrival (DOA) estimation based on clustering the inter-sensor data ratios (ISDRs) of a single acoustic vector sensor (AVS), referred to as AVS-ISDR, relies on reliable extraction of time-frequency points with high local signal-to-noise ratio (HLSNR-TFPs), and its performance degrades in noisy environments. This paper investigates deep neural networks (DNNs) trained with noisy-clean speech pairs under different SNR levels and noise types to improve the performance of AVS-ISDR in noisy conditions. The DNNs are trained to learn characteristics reflecting the level of speech information at different TFPs, which helps to generate a reliable spectral mask for obtaining a noise-reduced spectrum. Correspondingly, a robust DOA estimation algorithm named AVS-DNN-ISDR has been developed. Experimental results verify that the proposed DNN-based spectral mask improves the extraction of reliable HLSNR-TFPs at different SNR levels. Results from simulations and real AVS recordings further validate that AVS-DNN-ISDR achieves high DOA estimation accuracy even when the SNR is lower than 0 dB.
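
The role a spectral mask plays in picking out HLSNR-TFPs can be illustrated with a minimal sketch. It is not the paper's method: the abstract does not specify the network, the mask definition, or any threshold. The sketch assumes an oracle ratio mask computed from known clean and noise spectra as a stand-in for the DNN output, and the function names and the 0.8 threshold are hypothetical.

```python
import numpy as np

def ideal_ratio_mask(clean_mag, noise_mag, eps=1e-12):
    """Oracle stand-in for a DNN-estimated mask (assumption, not the paper's mask).

    Returns, per time-frequency point, the fraction of energy that is speech.
    """
    clean_pow = clean_mag ** 2
    noise_pow = noise_mag ** 2
    return clean_pow / (clean_pow + noise_pow + eps)

def select_hlsnr_tfps(mask, threshold=0.8):
    """Boolean map of points treated as high local SNR (hypothetical threshold)."""
    return mask >= threshold

def apply_mask(noisy_mag, mask):
    """Scale the noisy magnitude spectrum by the mask to suppress noisy points."""
    return noisy_mag * mask

# Toy complex spectra (4 frequency bins x 6 frames); random data, for shape only.
rng = np.random.default_rng(0)
clean = rng.normal(size=(4, 6)) + 1j * rng.normal(size=(4, 6))
noise = 0.5 * (rng.normal(size=(4, 6)) + 1j * rng.normal(size=(4, 6)))
noisy_mag = np.abs(clean + noise)

mask = ideal_ratio_mask(np.abs(clean), np.abs(noise))
selected = select_hlsnr_tfps(mask)        # candidate HLSNR-TFPs
denoised_mag = apply_mask(noisy_mag, mask)  # noise-reduced magnitude spectrum
```

With this particular mask definition, a mask value m corresponds to a local SNR of m / (1 - m), so a threshold of 0.8 keeps points whose local SNR is roughly 6 dB or higher. In a pipeline like the one the abstract describes, only the selected points would be passed on to the ISDR clustering stage.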

Authors


  •   Zheng, Weiqiao (external author)
  •   Zou, Yue-Xian (external author)
  •   Ritz, Christian H.

Publication Date


  • 2015

Citation


  • W. Q. Zheng, Y. X. Zou & C. Ritz, "Spectral mask estimation using deep neural networks for inter-sensor data ratio model based robust DOA estimation," in Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on, 2015, pp. 325-329.

Scopus Eid


  • 2-s2.0-84946053712

Ro Metadata Url


  • http://ro.uow.edu.au/eispapers/5634

Start Page


  • 325

End Page


  • 329
