This paper proposes an efficient feature vector processing technique to protect Speech Emotion Recognition (SER) systems
against a variety of noise types. In the proposed approach, emotional feature vectors are extracted from speech processed by comb
filtering. These vectors are then used to construct robust emotion models based on feature vector classification. We modify
conventional comb filtering by using speech presence probability to reduce the adverse effects of incorrect pitch estimation under
background noise conditions. The modified comb filtering can correctly enhance the harmonics, which are an important factor
in SER. The feature vector classification technique categorizes feature vectors into either discriminative vectors or
non-discriminative vectors based on a log-likelihood criterion. This method can successfully select the discriminative vectors
while preserving correct emotional characteristics. Robust emotion models can thus be constructed using only such
discriminative vectors. In SER experiments on an emotional speech corpus contaminated by various noises, our approach
outperformed the baseline system.
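
As an illustration of the feature vector classification step, the following is a minimal sketch of one possible log-likelihood criterion. It is not the exact rule used in this work: the abstract does not specify the criterion, so the sketch assumes diagonal-covariance Gaussian emotion models and treats a vector as discriminative when its log-likelihood under its own emotion model exceeds that of the best competing model by more than a margin. The names `gaussian_log_likelihood`, `split_discriminative`, and `margin`, as well as the toy models, are hypothetical.

```python
import math


def gaussian_log_likelihood(x, mean, var):
    """Log-likelihood of vector x under a diagonal-covariance Gaussian (assumed model)."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def split_discriminative(vectors, label, models, margin=0.0):
    """Split the vectors of one emotion into (discriminative, non_discriminative).

    Assumed rule: a vector is discriminative when its log-likelihood under its
    own emotion model beats the best competing model by more than `margin`.
    """
    keep, drop = [], []
    for x in vectors:
        scores = {
            name: gaussian_log_likelihood(x, mean, var)
            for name, (mean, var) in models.items()
        }
        own = scores[label]
        best_other = max(s for name, s in scores.items() if name != label)
        (keep if own - best_other > margin else drop).append(x)
    return keep, drop


# Toy one-dimensional emotion models: (mean, variance) per emotion.
models = {"angry": ([2.0], [1.0]), "neutral": ([-2.0], [1.0])}
angry_vectors = [[2.5], [1.0], [-1.5]]

discriminative, non_discriminative = split_discriminative(
    angry_vectors, "angry", models
)
```

In this toy setting, the vector at -1.5 lies closer to the neutral model than to the angry model, so it is set aside as non-discriminative, and only the remaining vectors would be used to re-estimate the angry model.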