Encoding features robust to unseen modes of variations with attentive recurrent neural networks = 주의 깊은 회귀 신경망 네트워크를 이용한 처음 보는 변화에 강인한 특징 표현 연구

Recurrent neural networks, particularly long short-term memory (LSTM) units, have been popular as an efficient tool for encoding dynamic features in sequences. While LSTM units have been fairly successful at encoding dynamic features, in practice their performance is affected by types of variation unseen during training. Although this shortcoming can be mitigated by training the LSTM with data containing different modes of variation, the number of variations that could occur at test time is unbounded, which makes it difficult to produce LSTM units robust to all types of variation. Hence it is important to devise a method for encoding dynamic features robust to unseen modes of variation.

In this work, we first investigate the effect that modes of variation have on the dynamic features encoded with LSTMs. We show that the LSTM retains information related to the mode of variation in the sequence, which is irrelevant to the task at hand. We experimentally show that the forget gate of the LSTM is designed to discard features temporally irrelevant to the task at hand; however, it is not designed to handle non-temporal variations. Encoding such variations into the dynamic features can substantially reduce the discriminability of the encoded features, especially when these variations are unseen during training. To encode features robust to unseen variations, it is important to identify the variations apparent in the sequence at test time. To that end, we devise multiple LSTM adaptations and network architectures that first identify and encode the type of variation apparent in the test sequence, and then suppress the negative effect the modes of variation have on the encoded dynamic features.

In this work we devise three approaches for encoding features robust to unseen modes of variation. The first two methods are application-specific methods that encode features robust to unseen modes of variation in the facial expression recognition task. The third is a generalized and compact LSTM adaptation, named the attentive mode variational LSTM, which generalizes to different types of features and applications. The proposed attentive mode variational LSTM unit has an input signal separator, which separates the input into two parts: (1) task-relevant dynamic sequence features and (2) task-irrelevant static sequence features. The task-relevant features are the input feature elements that contain the most dynamics related to the task at hand; they are used to encode and emphasize the dynamics in the input sequence. The task-irrelevant static sequence features are the input feature elements that contain the least dynamics related to the task at hand; they are utilized to encode the mode of variation in the input sequence, regardless of whether it was seen during training. The task-relevant dynamic sequence features and the task-irrelevant static sequence features are processed in independent cell states to disentangle the effect of the variation from the task-relevant dynamic features. The effect of the encoded variation is then suppressed with a shared output gate, resulting in dynamic features robust to unseen variations. The effectiveness of the proposed method is verified on two tasks: facial expression recognition and human action recognition.
Comprehensive experiments verified that the proposed method encodes dynamic features robust to variations unseen during training.
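
Below is a minimal PyTorch sketch of how an attentive mode variational LSTM cell along the lines described in the abstract could be laid out: an input signal separator (assumed here to be a learned sigmoid mask) splits each input into task-relevant dynamic and task-irrelevant static features, the two parts update independent cell states, and a shared output gate yields dynamic features with the variation suppressed. The class and layer names, the mask form, and the subtraction used to suppress the variation state are illustrative assumptions, not the exact formulation from the thesis.

```python
import torch
import torch.nn as nn


class AttentiveModeVariationalLSTMCell(nn.Module):
    """Sketch: input separator, two independent cell states, shared output gate."""

    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        # Input signal separator: a soft (sigmoid) mask over input elements;
        # values near 1 mark task-relevant dynamic features, values near 0
        # mark task-irrelevant static features. (Assumed form.)
        self.separator = nn.Linear(input_size + hidden_size, input_size)
        # Input/forget/candidate gates for the task-relevant cell state.
        self.dyn_gates = nn.Linear(input_size + hidden_size, 3 * hidden_size)
        # Input/forget/candidate gates for the mode-of-variation cell state.
        self.stat_gates = nn.Linear(input_size + hidden_size, 3 * hidden_size)
        # Output gate shared by both cell states.
        self.out_gate = nn.Linear(input_size + hidden_size, hidden_size)

    def forward(self, x, state):
        h, c_dyn, c_stat = state
        xh = torch.cat([x, h], dim=-1)

        # (1) Separate the input into dynamic and static parts.
        mask = torch.sigmoid(self.separator(xh))
        x_dyn, x_stat = mask * x, (1.0 - mask) * x

        # (2) Encode task-relevant dynamics in their own cell state.
        i, f, g = self.dyn_gates(torch.cat([x_dyn, h], dim=-1)).chunk(3, dim=-1)
        c_dyn = torch.sigmoid(f) * c_dyn + torch.sigmoid(i) * torch.tanh(g)

        # (3) Encode the mode of variation in an independent cell state.
        i2, f2, g2 = self.stat_gates(torch.cat([x_stat, h], dim=-1)).chunk(3, dim=-1)
        c_stat = torch.sigmoid(f2) * c_stat + torch.sigmoid(i2) * torch.tanh(g2)

        # (4) Shared output gate; subtracting the variation state is one
        #     possible way to suppress its effect (an assumption here).
        o = torch.sigmoid(self.out_gate(xh))
        h_new = o * torch.tanh(c_dyn - c_stat)
        return h_new, (h_new, c_dyn, c_stat)


# Hypothetical usage over a sequence of frame-level features.
if __name__ == "__main__":
    cell = AttentiveModeVariationalLSTMCell(input_size=128, hidden_size=64)
    seq = torch.randn(16, 30, 128)          # (batch, time, features)
    h = c_dyn = c_stat = torch.zeros(16, 64)
    for t in range(seq.size(1)):
        out, (h, c_dyn, c_stat) = cell(seq[:, t], (h, c_dyn, c_stat))
    # `out` now holds dynamic features with the mode of variation suppressed.
```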
Advisors
Ro, Yong Man (노용만)
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2019
Identifier
325007
Language
eng
Description

Thesis (Ph.D.) - Korea Advanced Institute of Science and Technology: School of Electrical Engineering, 2019.8, [iv, 57 p.]

Keywords

Long short-term memory (LSTM); recurrent neural networks (RNN); attention; robust features; modes of variation; facial expression recognition; human action recognition

URI
http://hdl.handle.net/10203/283287
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=871461&flag=dissertation
Appears in Collection
EE-Theses_Ph.D.(박사논문)
Files in This Item
There are no files associated with this item.
