a representative characteristic of the class, and that this canonical concept is a key factor in recognizing objects under various perturbations, e.g., geometric, illumination, photometric, and intra-class variations, as well as in adapting to unseen classes. If a canonical concept can be obtained by neutralizing these perturbations in images, the learned canonical concept can be applied to recognize unseen classes from only a few samples. Based on this canonical-concept assumption, we propose approaches that neutralize various perturbations to extract a canonical concept by learning a generalizable representation space.
To this end, we validate the proposed approaches on graphic symbols, a domain in which the canonical concept is clearly defined by prototypes; accordingly, prototypes serve as the canonical concepts in our experiments. The contributions of this dissertation are as follows:
(1) A metric-learning-based method is proposed to learn the relation between prototypes (canonical concepts) and real images. Incorporating prototypes during training yields a better representation and, in turn, higher accuracy on one-shot and few-shot classification.
(2) A generative-model-based learning method is proposed that learns the neutralizing process through visual composition. Our experiments demonstrate that the representation learned with a generative loss generalizes better than representations learned with metric-learning-based approaches.
(3) A class-agnostic relative-transformation estimation method is proposed to neutralize geometric perturbations. The relative transformer network can infer the geometric transformation between arbitrary objects and is therefore applicable to few-shot tasks. We validate this approach on one-shot classification under large geometric variations.

With the recent advancements in deep learning, image recognition approaches now surpass human performance. However, most state-of-the-art approaches require large datasets, which are notoriously expensive to acquire. When a large dataset for a task is not available, a model must either learn from few samples or rely on a representation that generalizes to unseen data. Solving a task with limited data remains an open problem. In contrast, humans can quickly learn a new concept from a few examples or experiences. This ability rests on two aspects of intelligence: general knowledge accumulated from a large amount of experience and information, and a fast adaptation ability that exploits this general knowledge to learn new concepts.
Inspired by this human learning ability, learning approaches that generalize to tasks with limited data are receiving attention. In image classification, classifying objects from few samples is called few-shot learning. During the training phase, a model learns general knowledge from the training set. In the test phase, the trained model performs the target task, i.e., classifying unseen classes given only a few samples per class. Research related to few-shot learning follows two directions. One is learning a general representation space that remains discriminative for unseen classes, so that instant adaptation is possible. The other, called meta-learning or learning-to-learn, learns how to update a model from few samples. Because meta-learning methods update a model using only a few samples, their performance depends on the capability of the initial model. In this sense, learning a model with a general representation is a fundamental problem in few-shot learning.
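To make the general-representation direction concrete, the following is a minimal sketch of a few-shot classification episode in which each class is summarized by the mean of its support embeddings (a prototype) and a query is assigned to the nearest prototype. The names, toy features, and identity embedding are illustrative placeholders standing in for a trained feature extractor, not the dissertation's actual method:

```python
import numpy as np

def embed(x):
    # Placeholder for a trained feature extractor f_theta; here, the identity map.
    return np.asarray(x, dtype=float)

def classify_episode(support, query):
    """support: dict mapping class label -> list of samples (the K 'shots').
    Returns the predicted class label for `query`."""
    # Each class prototype is the mean of its support embeddings.
    prototypes = {c: np.mean([embed(s) for s in shots], axis=0)
                  for c, shots in support.items()}
    # Assign the query to the nearest prototype (squared Euclidean distance).
    return min(prototypes,
               key=lambda c: np.sum((embed(query) - prototypes[c]) ** 2))

# A toy 2-way 2-shot episode with 2-D features.
support = {"circle": [[0.0, 1.0], [0.2, 0.8]],
           "square": [[1.0, 0.0], [0.9, 0.1]]}
print(classify_episode(support, [0.1, 0.9]))  # -> circle
```

Because classification reduces to distances in the embedding space, the same trained extractor can be applied to entirely unseen classes at test time; only the support set changes per episode.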
In this dissertation, we propose approaches to learn a general representation space. We postulate the hypothesis that objects of each class share a canonical concept