A general compression approach to multi-channel three-dimensional audio

Journal Article


Abstract


  • This paper presents a technique for low bit rate compression of three-dimensional (3D) audio produced by multiple loudspeaker channels. The approach is based on the time-frequency analysis of the localization of spatial sound sources within the 3D space as rendered by a multi-channel audio signal (in this case 16 channels). This analysis results in the derivation of a stereo downmix signal representing the original 16 channels. Alternatively, a mono-downmix signal with side information representing the location of sound sources within the 3D spatial scene can also be derived. The resulting downmix signals are then compressed with a traditional audio coder, resulting in a representation of the 3D soundfield at bit rates comparable with existing stereo audio coders while maintaining the perceptual quality produced from separate encoding of each channel. © 2006-2012 IEEE.
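The abstract's idea of a mono downmix accompanied by source-location side information can be illustrated with a minimal sketch. This is not the authors' algorithm: the function name `downmix_tile`, the plain-sum downmix, and the energy-weighted direction estimate are all assumptions chosen for illustration, and the loudspeaker directions in the usage lines are hypothetical.

```python
import math


def downmix_tile(values, directions_deg):
    """Collapse one time-frequency tile of a multi-channel signal.

    values: one real amplitude per loudspeaker channel for this tile.
    directions_deg: (azimuth, elevation) of each loudspeaker, in degrees.
    Returns (mono, azimuth_deg, elevation_deg).
    """
    # Assumed downmix: plain sum of the channel amplitudes.
    mono = sum(values)

    # Assumed localization cue: energy-weighted sum of loudspeaker unit vectors.
    x = y = z = 0.0
    for v, (az, el) in zip(values, directions_deg):
        energy = v * v
        az_r, el_r = math.radians(az), math.radians(el)
        x += energy * math.cos(el_r) * math.cos(az_r)
        y += energy * math.cos(el_r) * math.sin(az_r)
        z += energy * math.sin(el_r)

    # Convert the summed vector back to angles (the side information).
    azimuth = math.degrees(math.atan2(y, x))
    elevation = math.degrees(math.atan2(z, math.hypot(x, y)))
    return mono, azimuth, elevation


# Hypothetical three-loudspeaker layout: front, left, overhead.
layout = [(0.0, 0.0), (90.0, 0.0), (0.0, 90.0)]
mono, az, el = downmix_tile([0.0, 1.0, 0.0], layout)
```

In a scheme of this kind, only `mono` would go to a conventional audio coder, while the per-tile angles travel as low-rate side information that a decoder could use to re-pan the signal across the loudspeakers.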

Authors


  •   Cheng, Bin (external author)
  •   Ritz, Christian H.
  •   Burnett, Ian S. (external author)
  •   Zheng, Xiguang (external author)

Publication Date


  • 2013

Citation


  • B. Cheng, C. Ritz, I. S. Burnett & X. Zheng, "A general compression approach to multi-channel three-dimensional audio," IEEE Transactions on Audio, Speech and Language Processing, vol. 21, no. 8, pp. 1676-1688, 2013.

Scopus EID


  • 2-s2.0-84877867161

RO Full-text URL


  • http://ro.uow.edu.au/cgi/viewcontent.cgi?article=2020&context=eispapers

RO Metadata URL


  • http://ro.uow.edu.au/eispapers/1011

Number of Pages


  • 12

Start Page


  • 1676

End Page


  • 1688

Volume


  • 21

Issue


  • 8
