
A psychoacoustic-based analysis-by-synthesis scheme for jointly encoding multiple audio objects into independent mixtures

Conference Paper


Abstract


  • Perceptually accurate representation of audio objects obtained from multi-track audio signals is desired for applications such as interactive soundfield rendering and browsing. Presented in this work is a scalable psychoacoustic analysis-by-synthesis approach to extract the perceptually dominant time-frequency audio objects from a multi-track audio signal. The proposed compression framework exploits sparsity in the perceptual time-frequency domain where up to eight audio objects can be efficiently encoded using only two audio mixtures with side information representing the origin of the time-frequency instances in the mixture signals. The proposed approach, judged by both objective and subjective tests, results in superior audio quality compared to existing techniques when encoding more than 5 audio objects.
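The idea of packing several sparse objects into two mixtures plus side information can be illustrated with a minimal sketch. It is not the authors' implementation: plain magnitude stands in for the psychoacoustic dominance measure, the analysis-by-synthesis loop is omitted, and the names `encode_objects` and `decode_objects` are hypothetical. Each object is assumed to be available already as a complex time-frequency array.

```python
import numpy as np

def encode_objects(objects_tf, n_mix=2):
    """Keep the n_mix strongest objects in every time-frequency bin.

    objects_tf: complex array of shape (n_objects, n_frames, n_bins).
    Returns (mixtures, side_info), both of shape (n_mix, n_frames, n_bins).
    Assumption: magnitude is used as a simple proxy for perceptual dominance.
    """
    magnitude = np.abs(objects_tf)
    # Object indices per bin, sorted from strongest to weakest, truncated to n_mix.
    order = np.argsort(-magnitude, axis=0)[:n_mix]
    # Each mixture carries the coefficient of one selected object per bin.
    mixtures = np.take_along_axis(objects_tf, order, axis=0)
    # Side information: which object each mixture coefficient came from.
    return mixtures, order.astype(np.uint8)

def decode_objects(mixtures, side_info, n_objects):
    """Route every mixture coefficient back to its originating object."""
    out = np.zeros((n_objects,) + mixtures.shape[1:], dtype=mixtures.dtype)
    np.put_along_axis(out, side_info.astype(np.intp), mixtures, axis=0)
    return out
```

When no more than two objects are active in any given bin, this round trip is lossless; otherwise the weaker objects in that bin are discarded, which is where a perceptual selection criterion would matter.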

Publication Date


  • 2013

Citation


  • Zheng, X., Ritz, C. & Xi, J. (2013). A psychoacoustic-based analysis-by-synthesis scheme for jointly encoding multiple audio objects into independent mixtures. IEEE International Conference on Acoustics, Speech and Signal Processing (pp. 281-285). United States: Institute of Electrical and Electronics Engineers.

Scopus EID


  • 2-s2.0-84890469373

RO Metadata URL


  • http://ro.uow.edu.au/eispapers/1943

Start Page


  • 281

End Page


  • 285
