Unsupervised feature extraction for the representation and recognition of lip motion video

The lip-reading recognition is reported with lip-motion features extracted from multiple video frames by three unsupervised learning algorithms, i.e., Principle Component Analysis (PCA), Independent Component Analysis (ICA), and Non-negative Matrix Factorization (NMF). Since the human perception of facial motion goes through two different pathways, i.e., the lateral fusifom gyrus for the invariant aspects and the superior temporal sulcus for the changeable aspects of faces, we extracted the dynamic video features from multiple consecutive frames for the latter. The multiple-frame features require less number of coefficients for the same frame length than the single-frame static features. The ICA-based features are most sparse, while the corresponding coefficients for the video representation are the least sparse. PCA-based features have the opposite characteristics, while the characteristics of the NMF-based features are in the middle. Also the ICA-based features result in much better recognition performance than the others.
Publisher
SPRINGER-VERLAG BERLIN
Issue Date
2006
Language
ENG
Citation

COMPUTATIONAL INTELLIGENCE AND BIOINFORMATICS, PT 3, PROCEEDINGS BOOK SERIES: LECTURE NOTES IN COMPUTER SCIENCE, v.4115, pp.741 - 746

ISSN
0302-9743
URI
http://hdl.handle.net/10203/10213
Appears in Collection
EE-Journal Papers(저널논문)
  • Hit : 151
  • Download : 5
  • Cited 0 times in thomson ci
This item is cited by other documents in WoS
⊙ Detail Information in WoSⓡClick to seewebofscience_button

qr_code

  • mendeley

    citeulike


rss_1.0 rss_2.0 atom_1.0