Encoding multiple audio objects using intra-object sparsity

Journal Article


Abstract


  • Preserving audio scenes in the form of audio objects has become common in recent years. Object-based audio techniques provide more flexibility for personalized rendering as well as a more accurate audio object trajectory. For encoding and transmitting multiple audio objects in a lossy manner, a new compression framework for multiple simultaneously occurring audio objects is presented in this work. The proposed encoding approach is based on the intra-object sparsity (approximate k-sparsity). After establishing a quantitative measure of approximate k-sparsity, statistical analysis is employed to validate the proposed intra-object sparsity of audio objects. By exploring this intra-object sparsity, multiple simultaneously occurring audio objects are compressed into a mono downmix signal with side information. This downmix signal can be further compressed by legacy audio codecs. Meanwhile, the side information is transmitted in a lossless manner. The objective and subjective evaluations revealed that the proposed compression framework achieved better perceptual quality compared to an existing technique where up to eight audio objects are considered. The subjective evaluations also confirmed that the proposed approach is able to achieve scalable transmission according to the bandwidth while preserving the perceptual quality of both the individual audio objects and the spatial audio scenes.
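
The abstract describes the idea only at a high level, so the following is a minimal illustrative sketch and not the authors' algorithm. It assumes each object is already represented as a list of real-valued transform coefficients for one frame. The names `energy_preserved`, `downmix`, and `separate` are hypothetical, as are the energy-ratio sparsity measure and the per-bin "largest magnitude wins" rule; they are chosen only to show how sparse objects might share one mono signal plus side information.

```python
from typing import List, Tuple


def energy_preserved(coeffs: List[float], k: int) -> float:
    """Assumed sparsity measure: fraction of frame energy held by the
    k largest-magnitude coefficients (1.0 means exactly k-sparse)."""
    energies = sorted((c * c for c in coeffs), reverse=True)
    total = sum(energies)
    if total == 0.0:
        return 1.0  # a silent frame is trivially sparse
    return sum(energies[:k]) / total


def downmix(objects: List[List[float]]) -> Tuple[List[float], List[int]]:
    """For every bin, keep the largest-magnitude coefficient among all
    objects (mono downmix) and record which object it came from
    (side information)."""
    n_bins = len(objects[0])
    mono: List[float] = []
    owner: List[int] = []
    for b in range(n_bins):
        idx = max(range(len(objects)), key=lambda i: abs(objects[i][b]))
        mono.append(objects[idx][b])
        owner.append(idx)
    return mono, owner


def separate(mono: List[float], owner: List[int], n_objects: int) -> List[List[float]]:
    """Rebuild each object by routing every downmix bin back to its owner;
    bins owned by other objects are left at zero."""
    out = [[0.0] * len(mono) for _ in range(n_objects)]
    for b, (value, idx) in enumerate(zip(mono, owner)):
        out[idx][b] = value
    return out


# Two toy objects whose significant coefficients fall in different bins.
obj_a = [0.0, 4.0, 0.0, 0.1]
obj_b = [3.0, 0.0, 0.2, 0.0]
mono, owner = downmix([obj_a, obj_b])
recovered = separate(mono, owner, n_objects=2)
```

In this toy case the two objects never occupy the same bin, so the routing step recovers them exactly. When objects overlap in a bin, the smaller coefficient is discarded, which is where such a scheme becomes lossy.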

Authors


  •   Jia, Maoshen (external author)
  •   Yang, Ziyu (external author)
  •   Bao, Changchun (external author)
  •   Zheng, Xiguang (external author)
  •   Ritz, Christian H.

Publication Date


  • 2015

Citation


  • M. Jia, Z. Yang, C. Bao, X. Zheng & C. Ritz, "Encoding multiple audio objects using intra-object sparsity," IEEE Transactions on Audio, Speech and Language Processing, vol. 23, (6) pp. 1082-1095, 2015.

Scopus Eid


  • 2-s2.0-84928486986

Ro Metadata Url


  • http://ro.uow.edu.au/eispapers/4350

Number Of Pages


  • 13

Start Page


  • 1082

End Page


  • 1095

Volume


  • 23

Issue


  • 6
