In this thesis, an improved SEW/REW decomposition method with pitch-dependent phase generation and a novel variable bit rate (VBR) scheme are proposed to enhance the speech quality and reduce the bit rate of the waveform interpolation (WI) coder.
In the original WI scheme, a characteristic waveform (CW) is decomposed into a slowly evolving waveform (SEW) and a rapidly evolving waveform (REW) in Cartesian coordinates, which may distort the spectral shape of the reconstructed CWs. In particular, speech quality inevitably degrades when the REW contains SEW components. To solve this problem, the proposed decomposition is performed in the magnitude domain, which reduces spectral distortion. The phase of each CW is generated after the signal is classified as silence, unvoiced speech, or voiced speech according to its pitch value. The proposed VBR scheme replaces the excitation signal of silence and unvoiced speech with white Gaussian noise and allocates the bit rate variably.
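As a rough illustration of a magnitude-domain decomposition, the sketch below smooths each harmonic's magnitude along the CW evolution axis to obtain an SEW magnitude and takes the remainder as the REW magnitude. The moving-average smoother, the window length `taps`, the signed residual, and the function name `decompose_magnitude` are all assumptions made for this example; the filter and the phase handling actually used in the thesis are not specified here.

```python
def decompose_magnitude(cw_magnitudes, taps=5):
    """Split a sequence of CW magnitude spectra into SEW and REW parts.

    cw_magnitudes: list of frames; each frame is a list of harmonic
    magnitudes, and all frames have the same length.
    taps: odd length of the moving-average window (an assumed smoother).
    """
    n_frames = len(cw_magnitudes)
    half = taps // 2
    sew, rew = [], []
    for t in range(n_frames):
        # Window of neighbouring CWs, truncated at the sequence edges.
        lo, hi = max(0, t - half), min(n_frames, t + half + 1)
        window = cw_magnitudes[lo:hi]
        # SEW magnitude: per-harmonic average over the window, i.e. a
        # low-pass filter along the evolution (time) axis.
        smooth = [sum(column) / len(window) for column in zip(*window)]
        # REW magnitude: what the smoother removed (signed residual).
        residual = [m - s for m, s in zip(cw_magnitudes[t], smooth)]
        sew.append(smooth)
        rew.append(residual)
    return sew, rew


if __name__ == "__main__":
    # First harmonic fluctuates from CW to CW; second harmonic is steady.
    frames = [[1.0, 0.5], [3.0, 0.5], [1.0, 0.5], [3.0, 0.5]]
    sew, rew = decompose_magnitude(frames, taps=3)
    for frame, s, r in zip(frames, sew, rew):
        print(frame, s, r)
```

Under these assumptions, a harmonic whose magnitude does not change between CWs contributes nothing to the REW, and the two parts always sum back to the original magnitude spectrum.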
The performance of the proposed method was evaluated using the perceptual evaluation of speech quality (PESQ) score. The proposed CW modification improves the PESQ score by 0.32 over the baseline score of 3.368. In addition, the proposed VBR scheme reduces the required bit rate by 6.7%. Experimental results show that the proposed algorithm achieves improved speech quality while reducing the required bit rate compared with conventional methods.