Efficient harmonic peak detection of vowel sounds for enhanced voice activity detection

Voice activity detection (VAD) discriminates speech segments from background noise and is a critical step in numerous speech-related applications. However, distinguishing speech from noise based on the properties of the noise is unreliable, because real-world noise is difficult to predict and characterise. In this study, the authors instead focus on the intrinsic characteristics of speech. The harmonic peaks of vowel sounds have higher energies than the other spectral components of speech and are the speech features most likely to survive severe noise. Therefore, the energy differences between harmonic peaks and other spectral components show promise for robust VAD. To exploit this feature, the harmonic peaks must be accurately located. For this purpose, this study proposes an efficient harmonic peak location detection (HPD) method. In extensive experiments across various noise types and signal-to-noise ratios, the authors found that VAD with the proposed HPD approach outperforms existing VAD methods, with reasonable computational cost and higher robustness.
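The abstract does not spell out the HPD algorithm, so the following is only a minimal sketch of the underlying idea: comparing the energy at harmonic peaks with the energy of the remaining spectral components. It is not the authors' method. It assumes the fundamental frequency is already known, whereas locating the peaks is precisely what the paper's HPD method addresses. The function names, the synthetic test signals, and the parameter values are all hypothetical.

```python
import cmath
import math
import random


def power_spectrum(frame):
    """Naive DFT power spectrum for bins 0..N/2 (adequate for short frames)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def harmonic_contrast_db(frame, sample_rate, f0_hz):
    """Mean power at the harmonic bins of f0 relative to all other bins, in dB.

    Assumption: f0_hz is given and lies below the Nyquist frequency.
    """
    power = power_spectrum(frame)
    bin_hz = sample_rate / len(frame)
    harmonic_bins = set()
    h = 1
    while h * f0_hz < sample_rate / 2:
        harmonic_bins.add(round(h * f0_hz / bin_hz))
        h += 1
    peaks = [power[k] for k in harmonic_bins]
    # Exclude the DC bin from the non-harmonic reference set.
    rest = [p for k, p in enumerate(power) if k > 0 and k not in harmonic_bins]
    eps = 1e-12  # avoids log of zero on silent frames
    return 10 * math.log10((sum(peaks) / len(peaks) + eps)
                           / (sum(rest) / len(rest) + eps))


# Hypothetical test signals: a vowel-like harmonic stack in noise, and noise alone.
SAMPLE_RATE = 8000
FRAME_LEN = 256
F0_HZ = 250.0  # falls exactly on DFT bin 8 for this frame length

rng = random.Random(0)
noise = [rng.gauss(0.0, 0.5) for _ in range(FRAME_LEN)]
vowel = [
    sum(math.sin(2 * math.pi * F0_HZ * h * t / SAMPLE_RATE) / h for h in range(1, 6))
    + noise[t]
    for t in range(FRAME_LEN)
]

vowel_score = harmonic_contrast_db(vowel, SAMPLE_RATE, F0_HZ)
noise_score = harmonic_contrast_db(noise, SAMPLE_RATE, F0_HZ)
print(f"vowel-like frame: {vowel_score:.1f} dB, noise-only frame: {noise_score:.1f} dB")
```

In this toy setting the vowel-like frame should yield a clearly higher contrast than the noise-only frame, so a threshold on the contrast could serve as a crude speech/non-speech decision. A practical detector would also need windowing, frame-by-frame tracking, and an actual peak-location step.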
Publisher
INST ENGINEERING TECHNOLOGY-IET
Issue Date
2018-10
Language
English
Article Type
Article
Keywords

LIKELIHOOD RATIO TEST; SPEECH RECOGNITION; FREQUENCY; INFORMATION

Citation

IET SIGNAL PROCESSING, v.12, no.8, pp.975 - 982

ISSN
1751-9675
DOI
10.1049/iet-spr.2017.0553
URI
http://hdl.handle.net/10203/246575
Appears in Collection
BiS-Journal Papers(저널논문)
