LI-TTA: Language Informed Test-Time Adaptation for Automatic Speech Recognition

Test-Time Adaptation (TTA) has emerged as a crucial solution to the domain shift challenge, in which the target environment diverges from the original training environment. A prime example is TTA for Automatic Speech Recognition (ASR), which improves model performance by minimizing the entropy of the output predictions as a self-supervision signal. However, a key limitation of this self-supervision is that it focuses primarily on acoustic features and pays minimal attention to the linguistic properties of the input. To address this gap, we propose Language Informed Test-Time Adaptation (LI-TTA), which incorporates linguistic insights during TTA for ASR. LI-TTA integrates corrections from an external language model, merging linguistic and acoustic information by minimizing the CTC loss computed against the correction alongside the standard TTA loss. Through extensive experiments, we show that LI-TTA effectively improves the performance of TTA for ASR under various distribution shifts. The code is publicly accessible at https://github.com/EsYoon7/LiTTA.
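The objective described in the abstract, an entropy term combined with a CTC term computed against a language-model correction, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation (see the linked repository for that). The function names, the `weight` factor that balances the two terms, and the use of a plain weighted sum are all assumptions made for illustration, and the LM-corrected transcript is taken as given rather than produced by a real language model. The CTC term uses the standard forward algorithm in probability space, which is only suitable for toy-sized inputs.

```python
import math

def softmax(logits):
    """Convert one frame of logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def mean_entropy(probs):
    """Average Shannon entropy (in nats) over all frames."""
    ent = 0.0
    for frame in probs:
        ent -= sum(p * math.log(p) for p in frame if p > 0.0)
    return ent / len(probs)

def ctc_nll(probs, target, blank=0):
    """CTC negative log-likelihood of `target` via the forward algorithm.

    Works in probability space, so it will underflow on long inputs;
    a practical version would work in log space.
    """
    # Interleave blanks: [blank, t1, blank, t2, ..., blank]
    ext = [blank]
    for t in target:
        ext += [t, blank]
    size = len(ext)
    alpha = [0.0] * size
    alpha[0] = probs[0][ext[0]]
    if size > 1:
        alpha[1] = probs[0][ext[1]]
    for frame in probs[1:]:
        new = [0.0] * size
        for s in range(size):
            total = alpha[s]
            if s >= 1:
                total += alpha[s - 1]
            # Skipping a blank is allowed only between different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total += alpha[s - 2]
            new[s] = total * frame[ext[s]]
        alpha = new
    likelihood = alpha[-1] + (alpha[-2] if size > 1 else 0.0)
    return math.inf if likelihood == 0.0 else -math.log(likelihood)

def combined_loss(logits, lm_corrected_target, weight=1.0):
    """Hypothetical combination: entropy term plus weighted CTC term.

    `lm_corrected_target` stands in for the external LM's correction,
    given as a list of label indices. `weight` is an assumed
    hyperparameter, not a value taken from the paper.
    """
    probs = [softmax(frame) for frame in logits]
    return mean_entropy(probs) + weight * ctc_nll(probs, lm_corrected_target)
```

For example, `combined_loss([[0.0, 0.0], [0.0, 0.0]], [1], weight=0.5)` evaluates two uniform frames over a blank and one label. In an actual adaptation loop this scalar would be differentiated with respect to the model parameters, which this sketch does not attempt.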
Publisher
25th Interspeech Conference
Issue Date
2024-09
Language
English
Citation

25th Interspeech Conference

URI
http://hdl.handle.net/10203/323311
Appears in Collection
EE-Conference Papers (Academic Conference Papers)
Files in This Item
There are no files associated with this item.