Visual analysis of attention-based end-to-end speech recognition (어텐션 기반 엔드투엔드 음성인식 시각화 분석)

DC Field | Value | Language
dc.contributor.author | 임성민 | ko
dc.contributor.author | 구자현 | ko
dc.contributor.author | 김회린 | ko
dc.date.accessioned | 2019-07-11T07:30:14Z | -
dc.date.available | 2019-07-11T07:30:14Z | -
dc.date.created | 2019-06-12 | -
dc.date.issued | 2019-03 | -
dc.identifier.citation | 말소리와 음성과학 (Phonetics and Speech Sciences), v.11, no.1, pp.41-49 | -
dc.identifier.issn | 2005-8063 | -
dc.identifier.uri | http://hdl.handle.net/10203/263262 | -
dc.description.abstract | End-to-end speech recognition models, which consist of a single integrated neural network, have recently been proposed. An end-to-end model does not require several separate training steps, and its structure is easy to understand. However, it is difficult to understand how the model recognizes speech internally. In this paper, we visualize and analyze an attention-based end-to-end model to elucidate its internal mechanisms. We compare the acoustic model of a BLSTM-HMM hybrid model with the encoder of the end-to-end model, and visualize both using t-SNE to examine the differences between neural network layers. This allows us to delineate the difference between the acoustic model and the end-to-end model encoder. Additionally, we analyze the decoder of the end-to-end model from a language model perspective. Finally, we find that improving the decoder of the end-to-end model is necessary to achieve higher performance. | -
dc.language | Korean | -
dc.publisher | 한국음성학회 (The Korean Society of Speech Sciences) | -
dc.title | 어텐션 기반 엔드투엔드 음성인식 시각화 분석 | -
dc.title.alternative | Visual analysis of attention-based end-to-end speech recognition | -
dc.type | Article | -
dc.type.rims | ART | -
dc.citation.volume | 11 | -
dc.citation.issue | 1 | -
dc.citation.beginningpage | 41 | -
dc.citation.endingpage | 49 | -
dc.citation.publicationname | 말소리와 음성과학 (Phonetics and Speech Sciences) | -
dc.identifier.doi | 10.13064/KSSS.2019.11.1.041 | -
dc.identifier.kciid | ART002453501 | -
dc.contributor.localauthor | 김회린 | -
dc.description.isOpenAccess | Y | -
Appears in Collection
EE-Journal Papers
Files in This Item
109888.pdf (751.77 kB)
