
Learning How Long to Wait: Adaptively-Constrained Monotonic Multihead Attention for Streaming ASR


Song, Jaeyun / Shim, Hajin / Yang, Eunho

Monotonic Multihead Attention, which allows each head to learn its own alignment, performs well on simultaneous machine translation and streaming speech recognition. However, it incurs high latency because decoding must wait for the slowest head. Recent approaches such as Head-Synchronous Beam Search Decoding and its learnable version, Mutually-Constrained Monotonic Multihead Attention, address this issue by restricting the difference in the times of the frames chosen by the heads to a fixed waiting-time threshold. In this paper, we hypothesize that the optimal threshold for high performance with low latency depends on the input sequence, and we propose an adaptive algorithm that learns how long to wait depending on the input tokens by introducing a threshold prediction module. We evaluate our approach on two benchmark datasets for the online Automatic Speech Recognition task and demonstrate that our method reduces latency while also improving recognition accuracy.
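The thresholding idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the frame indices, and the per-token thresholds are all hypothetical, the thresholds are hand-picked stand-ins for what a learned threshold prediction module might output, and monotonicity across tokens is ignored for brevity. It only shows how capping each head at "earliest head + threshold" bounds the wait for the slowest head, and how a per-token threshold can tighten that bound.

```python
from typing import List, Sequence


def constrain_heads(head_frames: Sequence[int], threshold: int) -> List[int]:
    """Clip each head's chosen frame to at most `threshold` frames after the earliest head."""
    if threshold < 0:
        raise ValueError("threshold must be non-negative")
    latest_allowed = min(head_frames) + threshold
    return [min(frame, latest_allowed) for frame in head_frames]


def token_latency(head_frames: Sequence[int]) -> int:
    """A token can be emitted only once the slowest head has chosen its frame."""
    return max(head_frames)


# Hypothetical unconstrained choices: frames picked by 4 heads for 3 output tokens.
unconstrained = [[10, 12, 30, 11], [20, 21, 22, 45], [40, 41, 43, 42]]

# Fixed waiting-time threshold shared by every token.
fixed = [constrain_heads(frames, threshold=8) for frames in unconstrained]

# Per-token thresholds (assumption: hand-picked here, learned in an adaptive scheme).
per_token_thresholds = [4, 2, 8]
adaptive = [
    constrain_heads(frames, t)
    for frames, t in zip(unconstrained, per_token_thresholds)
]

print([token_latency(f) for f in unconstrained])  # latency without any constraint
print([token_latency(f) for f in fixed])          # latency with a fixed threshold
print([token_latency(f) for f in adaptive])       # latency with per-token thresholds
```

In this toy setting a smaller threshold always lowers latency; whether accuracy is preserved depends on how much the clipped heads actually needed the later frames, which is the trade-off a learned threshold is meant to handle.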

Publisher: Institute of Electrical and Electronics Engineers Inc.

Issue Date: 2021-12-15

Language: English

Citation: 2021 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2021, pp.441 - 448

DOI: 10.1109/ASRU51503.2021.9688138

URI: http://hdl.handle.net/10203/301615

Appears in Collection: AI-Conference Papers (Conference Papers)



