Speech separation from multi-speaker dialogues under a reverberant environment based on enhanced interaural coherence = Coherence-based speech separation technique using two microphones in a reverberant multi-speaker environment

The degenerate unmixing estimation technique (DUET) and model-based expectation-maximization source separation and localization (MESSL) separate the spectrogram based on histograms of interaural cues. However, accurate histogram-based separation is difficult because the histogram peaks spread around the actual source locations and overlap with each other due to reverberation. In addition, since speech recognition performance on reverberant speech is lower than on clean speech, only the direct sound, which is less affected by reverberation, should be extracted. To address this problem, the interaural coherence proposed in a previous study is used to isolate spectrogram bins that are strongly affected by reverberation. However, that approach does not apply sufficient ensemble averaging, so the effect of reverberation cannot be observed accurately. In this research, we apply sufficient ensemble averaging by determining the quasi-steady-state intervals of speech; the Canny edge detection algorithm, which is used in image processing, is applied to the spectrogram image to determine these intervals. Based on the determined intervals, an optimal interaural coherence calculation is applied so that the effect of reverberation can be observed more accurately at the same resolution. To extract only the direct sound, which is less affected by reverberation, we propose a model in which the coherence is applied to MESSL through a sigmoid function. As a result, we improve speech separation performance by narrowing the histogram distribution, and we extract only the spectrogram bins that are less affected by reverberation so that speech recognition performance does not deteriorate. This research makes it possible to improve the separation of multiple direct speech sources in a reverberant environment with a small number of microphones, and to apply the method to mobile devices or companion robots so as to provide better service through improved speech recognition performance.
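The abstract's core quantities can be sketched in code. The following is a minimal illustration, not the thesis's actual implementation: it computes a magnitude-squared interaural coherence between two microphone channels with ensemble averaging over neighboring STFT frames (a stand-in for averaging over the quasi-steady intervals the thesis detects with Canny edge detection), then turns the coherence into a soft time-frequency mask via a sigmoid, as in the proposed coherence-weighted MESSL model. All function names, the averaging window, and the sigmoid parameters are illustrative assumptions.

```python
import numpy as np
from scipy.signal import stft

def interaural_coherence(x_left, x_right, fs, nperseg=512, avg_frames=8):
    """Magnitude-squared coherence between two microphone channels.

    Ensemble averaging is approximated by a moving average over
    `avg_frames` STFT frames (assumed to lie in a quasi-steady interval).
    """
    _, _, XL = stft(x_left, fs=fs, nperseg=nperseg)
    _, _, XR = stft(x_right, fs=fs, nperseg=nperseg)

    def smooth(S):
        # Moving average along the time axis (works for complex arrays too)
        kernel = np.ones(avg_frames) / avg_frames
        return np.apply_along_axis(
            lambda v: np.convolve(v, kernel, mode="same"), 1, S)

    Sll = smooth(np.abs(XL) ** 2)          # auto-spectrum, left
    Srr = smooth(np.abs(XR) ** 2)          # auto-spectrum, right
    Slr = smooth(XL * np.conj(XR))         # cross-spectrum
    # Coherence in [0, 1]; values near 1 indicate direct-sound dominance
    return np.abs(Slr) ** 2 / (Sll * Srr + 1e-12)

def sigmoid_mask(coherence, threshold=0.7, slope=20.0):
    """Soft time-frequency mask emphasizing bins with high coherence."""
    return 1.0 / (1.0 + np.exp(-slope * (coherence - threshold)))
```

Such a mask could then multiply the spectrogram (or reweight MESSL's per-bin posteriors) so that bins dominated by reverberant energy, where coherence is low, are attenuated before resynthesis.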
Advisors
Park, Yong-Hwa (박용화)
Description
Korea Advanced Institute of Science and Technology (KAIST): Department of Mechanical Engineering
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2019
Identifier
325007
Language
eng
Description

Thesis (Master's) - Korea Advanced Institute of Science and Technology: Department of Mechanical Engineering, 2019.2, [iv, 50 p.]

Keywords

Time-frequency masking; speech separation; speech enhancement; reverberation; coherence

URI
http://hdl.handle.net/10203/265854
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=843010&flag=dissertation
Appears in Collection
ME-Theses_Master (Master's theses)
Files in This Item
There are no files associated with this item.
