Coherence-based quantitative analysis of reverberation effect on English automatic speech recognition error

DC Field: Value [Language]
dc.contributor.advisor: Park, Yong-Hwa
dc.contributor.advisor: 박용화
dc.contributor.author: Nam, Hyeonuk
dc.date.accessioned: 2021-05-13T19:31:11Z
dc.date.available: 2021-05-13T19:31:11Z
dc.date.issued: 2020
dc.identifier.uri: http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=910901&flag=dissertation [en_US]
dc.identifier.uri: http://hdl.handle.net/10203/284604
dc.description: Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST) : Department of Mechanical Engineering, 2020.2, [vii, 45 p.]
dc.description.abstract: Automatic speech recognition (ASR) is one of the core techniques for human-machine interaction, yet it is too vulnerable to external noise for real-life use. Reverberation in particular is convolutive: it reduces speech clarity, which hinders ASR, and it is very difficult to remove from speech recorded in reverberant environments. Improving ASR's robustness to reverberation is therefore essential to applying ASR in various environments. In this research, as a preliminary study toward optimizing ASR performance on reverberated speech, the effect of reverberation on ASR error is quantitatively analyzed using coherence. The ASR setting considered is single-channel machine-listening ASR for the English language. Room impulse responses obtained under various reverberant conditions are convolved with clean speech from an English-language corpus to simulate reverberated speech. Coherence is used to measure the similarity between each reverberated speech spectrogram and the corresponding clean speech spectrogram at each time frame and frequency bin. A variable named mean phoneme coherence (MPC) is introduced to quantify the spectral contamination of a phoneme in reverberated speech. The MPC of a phoneme is obtained by averaging the coherence values over the time frames and frequency bins within the time interval in which that phoneme is articulated. Spectral contamination of a phoneme is small when its MPC is close to one and severe when its MPC is close to zero. By applying ASR to reverberated speech and comparing the MPC distributions of each phoneme in correctly and incorrectly recognized words, it is shown that MPC values are statistically higher for phonemes in correctly recognized words than for phonemes in incorrectly recognized words. This result quantitatively verifies that more severe spectral contamination due to reverberation leads to more ASR errors.
By comparing the MPC distributions of phoneme groups, it is shown that, as spectral contamination increases, stops increase the ASR error rate the least while fricatives increase it the most. In addition, sequential interaction between phonemes is analyzed by grouping phonemes into voiced consonants, unvoiced consonants, and vowels. As spectral contamination increases, voiced consonants increase the ASR error rate less when they are preceded by consonants. Vowels and unvoiced consonants, on the other hand, increase the ASR error rate more when one precedes the other. Through these methods, the physical interactions between phonemes and reverberation-induced spectral contamination, and their effect on English ASR error, are quantitatively analyzed based on coherence.
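dc.language: eng
dc.publisher: Korea Advanced Institute of Science and Technology (KAIST)
dc.subject: reverberation; single-channel; machine listening; English language; automatic speech recognition; quantitative; phoneme; spectral contamination; coherence; mean phoneme coherence (MPC)
dc.subject: reverberation; single-channel; machine listening; English; speech recognition; quantitative; phoneme; spectrogram contamination; coherence; mean phoneme coherence (MPC)
dc.title: Coherence-based quantitative analysis of reverberation effect on English automatic speech recognition error
dc.title.alternative: Coherence-based quantitative analysis of the effect of reverberation on English speech recognition error
dc.type: Thesis(Master)
dc.identifier.CNRN: 325007
dc.description.department: Korea Advanced Institute of Science and Technology (KAIST) : Department of Mechanical Engineering
dc.contributor.alternativeauthor: 남현욱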
Appears in Collection
ME-Theses_Master(석사논문)
Files in This Item
There are no files associated with this item.
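
The mean phoneme coherence (MPC) computation described in the abstract might be sketched roughly as below. This is a minimal illustration under stated assumptions, not the thesis's implementation: the record does not give the coherence estimator, so the sketch assumes a magnitude-squared coherence obtained by smoothing cross- and auto-spectra over a few neighboring time frames. All function names, the frame length, hop size, and smoothing width are hypothetical, and white noise with a synthetic decaying impulse response stands in for real speech and measured room impulse responses.

```python
import numpy as np

def stft(x, frame_len=256, hop=128):
    """Hann-windowed STFT; returns an array of shape (n_freq_bins, n_frames)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1).T

def smooth_time(a, width):
    """Moving average over `width` neighboring time frames (axis 1)."""
    kernel = np.ones(width) / width
    return np.apply_along_axis(
        lambda row: np.convolve(row, kernel, mode="same"), 1, a)

def coherence_map(clean, reverberant, frame_len=256, hop=128, width=5, eps=1e-12):
    """Assumed estimator: magnitude-squared coherence per time frame and
    frequency bin, from time-smoothed cross- and auto-spectra."""
    X = stft(clean, frame_len, hop)
    Y = stft(reverberant, frame_len, hop)
    Sxy = smooth_time(X * np.conj(Y), width)
    Sxx = smooth_time(np.abs(X) ** 2, width)
    Syy = smooth_time(np.abs(Y) ** 2, width)
    return np.abs(Sxy) ** 2 / (Sxx * Syy + eps)

def mean_phoneme_coherence(coh, start_s, end_s, fs, hop=128):
    """Average coherence over all bins and frames inside a phoneme interval."""
    first = int(start_s * fs / hop)
    last = max(first + 1, int(np.ceil(end_s * fs / hop)))
    return float(coh[:, first:last].mean())

# Stand-in data (assumption): 1 s of white noise instead of clean speech,
# and an exponentially decaying noise burst instead of a measured RIR.
fs = 16000
rng = np.random.default_rng(0)
clean = rng.standard_normal(fs)
rir = 0.1 * rng.standard_normal(4000) * np.exp(-np.arange(4000) / 800.0)
rir[0] = 1.0  # direct path
reverberant = np.convolve(clean, rir)[:len(clean)]

coh_same = coherence_map(clean, clean)
coh_rev = coherence_map(clean, reverberant)

# Hypothetical phoneme interval from 0.2 s to 0.3 s.
mpc_same = mean_phoneme_coherence(coh_same, 0.2, 0.3, fs)
mpc_rev = mean_phoneme_coherence(coh_rev, 0.2, 0.3, fs)
print(f"MPC, uncontaminated: {mpc_same:.3f}; MPC, reverberated: {mpc_rev:.3f}")
```

In practice the phoneme intervals would come from the time alignments of the speech corpus, and the per-phoneme MPC values would then be grouped by whether the containing word was recognized correctly.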
