
High Quality Audio Adversarial Examples Without Using Psychoacoustics

Chapter


Abstract


  • In the automatic speech recognition (ASR) domain, most, if not all, current audio adversarial examples (AEs) are generated by applying perturbations to input audio. Adversaries either constrain the norm of the perturbations or hide the perturbations below the hearing threshold based on psychoacoustics. These two approaches have their respective problems: norm-constrained perturbations introduce noticeable noise, while hiding perturbations below the hearing threshold can be prevented by deliberately removing inaudible components from the audio. In this paper, we present a novel method of generating targeted audio AEs. The perceptual quality of our audio AEs is significantly better than that of audio AEs generated by applying norm-constrained perturbations. Furthermore, unlike approaches that rely on psychoacoustics to hide perturbations below the hearing threshold, we show that our audio AEs can still be successfully generated even when inaudible components are removed from the audio.
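The norm-constrained baseline that the abstract contrasts against can be sketched as follows. This is not the paper's method; it is a generic illustration, assuming an L-infinity norm budget `epsilon` and toy signal values, of how a perturbation is projected back onto the allowed norm ball before being added to the input audio:

```python
import numpy as np

def clip_linf(delta, epsilon):
    """Project a perturbation onto the L-infinity ball of radius epsilon."""
    return np.clip(delta, -epsilon, epsilon)

# Toy audio signal and an unconstrained perturbation (illustrative values only).
audio = np.zeros(4)
delta = np.array([0.5, -0.2, 0.01, -0.9])
epsilon = 0.1  # hypothetical norm budget

# Every sample of the constrained perturbation now lies within [-0.1, 0.1];
# larger budgets yield stronger attacks but more audible noise.
constrained = clip_linf(delta, epsilon)
adversarial = audio + constrained
```

Because the budget bounds every sample uniformly, perturbation energy is spread across the whole signal regardless of what the listener can actually hear, which is why such AEs tend to carry audible noise.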

Publication Date


  • 2022

Edition


Citation


  • Zong, W., Chow, Y. W., & Susilo, W. (2022). High Quality Audio Adversarial Examples Without Using Psychoacoustics. In Lecture Notes in Computer Science (Vol. 13547 LNCS, pp. 163-177). doi:10.1007/978-3-031-18067-5_12

International Standard Book Number (ISBN) 13


  • 9783031180668

Scopus EID


  • 2-s2.0-85140465960

Web Of Science Accession Number


Book Title


  • Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Start Page


  • 163

End Page


  • 177

Place Of Publication

