AASIST: AUDIO ANTI-SPOOFING USING INTEGRATED SPECTRO-TEMPORAL GRAPH ATTENTION NETWORKS

Artefacts that differentiate spoofed from bona-fide utterances can reside in specific temporal or spectral intervals. Their reliable detection usually depends upon computationally demanding ensemble systems where each subsystem is tuned to some specific artefacts. We seek to develop an efficient, single system that can detect a broad range of different spoofing attacks without score-level ensembles. We propose a novel heterogeneous stacking graph attention layer that models artefacts spanning heterogeneous temporal and spectral intervals with a heterogeneous attention mechanism and a stack node. With a new max graph operation that involves a competitive mechanism and a new readout scheme, our approach, named AASIST, outperforms the current state-of-the-art by 20% relative. Even a lightweight variant, AASIST-L, with only 85k parameters, outperforms all competing systems.
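The "20% relative" figure in the abstract is a relative improvement, not an absolute one. The minimal sketch below shows the arithmetic, assuming the metric is an error rate where lower is better; the abstract does not name the metric. The function name `relative_reduction` and the two example values are hypothetical and chosen only to illustrate the calculation. They are not results from the paper.

```python
def relative_reduction(baseline: float, proposed: float) -> float:
    """Return the fractional reduction of an error metric relative to the baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - proposed) / baseline


# Hypothetical error rates in percent, used only to illustrate the arithmetic.
# These are NOT figures reported in the paper.
baseline_error = 1.00
proposed_error = 0.80

# An absolute drop of 0.20 points from a baseline of 1.00 is a 20% relative reduction.
print(f"{relative_reduction(baseline_error, proposed_error):.0%}")
```

Under this reading, the same relative figure can correspond to very different absolute gaps depending on how low the baseline error already is.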
Publisher
Institute of Electrical and Electronics Engineers Inc.
Issue Date
2022-05
Language
English
Citation

47th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2022, pp. 2405-2409

ISSN
1520-6149
DOI
10.1109/ICASSP43922.2022.9747766
URI
http://hdl.handle.net/10203/299796
Appears in Collection
EE-Conference Papers (Conference Papers)