Towards Visualizing and Detecting Audio Adversarial Examples for Automatic Speech Recognition

Chapter


Abstract


  • Automatic speech recognition (ASR) systems are now ubiquitous in many commonly used applications, as various commercial products rely on ASR techniques, which are increasingly based on machine learning, to transcribe voice commands into text for further processing. However, audio adversarial examples (AEs) have emerged as a serious security threat, as they have been shown to be able to fool ASR models into producing incorrect results. Although there are proposed methods to defend against audio AEs, the intrinsic properties of audio AEs compared with benign audio have not been well studied. In this paper, we show that the machine learning decision boundary patterns around audio AEs and benign audio are fundamentally different. In addition, using dimensionality reduction techniques, we show that these different patterns can be distinguished visually in 2D space. Based on dimensionality reduction results, this paper also demonstrates that it is feasible to detect previously unknown audio AEs using anomaly detection methods.

Publication Date


  • 2021

Citation


  • Zong, W., Chow, Y. W., & Susilo, W. (2021). Towards Visualizing and Detecting Audio Adversarial Examples for Automatic Speech Recognition. In Lecture Notes in Computer Science (Vol. 13083, pp. 531-549). doi:10.1007/978-3-030-90567-5_27

International Standard Book Number (ISBN-13)


  • 9783030905668

Scopus EID


  • 2-s2.0-85120059574

Book Title


  • Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Start Page


  • 531

End Page


  • 549
