Robust speaker DOA estimation based on the inter-sensor data ratio model and binary mask estimation in the bispectrum domain

Conference Paper


Abstract


  • When noise is directional rather than diffuse, most conventional direction of arrival (DOA) estimation techniques suffer performance degradation because their noise models are mismatched. In this paper, a novel robust DOA estimation algorithm is developed as an initial investigation into DOA estimation of speech under directional non-speech interference (DNSI) and non-directional background noise (NDBN) using an acoustic vector sensor (AVS), a compact co-incident microphone array. Specifically, by defining an inter-sensor data ratio model in the bispectrum domain (BISDR), the relationship between the BISDR and the speech DOA cues is derived. By recursively estimating the a priori local signal-to-interference ratio of the bispectrum (B-PriLSIR), a robust speech-dominated binary mask (SDBM) is estimated, and thus the speech DOA cue is faithfully extracted. Experimental results with simulated and recorded data demonstrate that the proposed algorithm offers high DOA estimation accuracy for all angles and is robust against DNSI and NDBN.

Authors


  •   Jin, Yanhan (external author)
  •   Zou, Yue-Xian (external author)
  •   Ritz, Christian H.

Publication Date


  • 2017

Citation


  • Jin, Y., Zou, Y. & Ritz, C. (2017). Robust speaker DOA estimation based on the inter-sensor data ratio model and binary mask estimation in the bispectrum domain. 2017 IEEE International Conference on Acoustics, Speech, and Signal Processing (pp. 3266-3270). United States: IEEE.

Scopus Eid


  • 2-s2.0-85023744707

Ro Metadata Url


  • http://ro.uow.edu.au/eispapers1/694

Start Page


  • 3266

End Page


  • 3270

Place Of Publication


  • United States
