An energy-efficient speech extraction (SE) processor is proposed for robust user speech recognition (SR) in head-mounted display (HMD) systems. User SE is essential for robust user SR in noisy environments. For low-latency SE, the FastSE algorithm is proposed to overcome the time-consuming user speech selection process based on constrained independent component analysis, resulting in an SE latency of less than 2 ms. Moreover, a reinforced-FastSE scheme is proposed to achieve 97.2% accuracy with only 33 kB of on-chip FastSE memory for low-power HMD applications. In addition, a reconfigurable matrix operation accelerator is implemented for energy-efficient acceleration of the dominant matrix operation in SE. As a result, the proposed SE processor achieves 1.3× higher speed with 4.24× smaller memory than the state-of-the-art work, making SR in noisy environments feasible for mobile HMD applications.