As data have become easier to access, much work has been devoted to extracting patterns from large collections of data. A representative approach is to cluster the data and then analyze each mixture component, where a mixture component is an individual distribution that constitutes a mixture distribution and represents the data locally. Mixture component analysis thus summarizes the typical patterns in the data by interpreting the individual components identified by the model. Since analyzing the data point by point requires extensive time and labor, this clustering task is essential in many real-world applications.
Beyond learning mixture components at a flat level, many recent studies have investigated mixture components with hierarchical relationships, that is, components organized hierarchically with respect to one another. From an analyst's perspective, the components can then be examined at different levels of abstraction, from the most general to the most specific. Because the level of detail required of a data summary varies with the purpose and context of the analysis, hierarchical mixture component analysis is well suited to such settings. Moreover, since the components are organized hierarchically, they convey a richer set of information and allow for improved interpretability.
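As a concrete illustration of the idea, the following sketch samples data from a toy two-level hierarchical mixture, in which each coarse component is refined into more specific child components. All component names and parameter values here are hypothetical, chosen only to make the general-to-specific structure visible:

```python
import numpy as np

rng = np.random.default_rng(0)

# A toy two-level hierarchy: two coarse components, each refined into
# two specific child components (names and means are hypothetical).
hierarchy = {
    "coarse_A": {"A1": -4.0, "A2": -2.0},
    "coarse_B": {"B1": 2.0, "B2": 4.0},
}

def sample(n):
    """Draw n points by first picking a coarse component uniformly,
    then one of its child components, then a Gaussian observation."""
    points, labels = [], []
    for _ in range(n):
        coarse = rng.choice(list(hierarchy))
        children = list(hierarchy[coarse].items())
        child, mean = children[rng.integers(len(children))]
        points.append(rng.normal(mean, 0.5))
        labels.append((coarse, child))
    return np.array(points), labels

x, y = sample(1000)
```

An analyst working at the coarse level would only distinguish the two general components, while the child level exposes the more specific patterns inside each of them, mirroring the choice of abstraction level described above.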
In this dissertation, I present several hierarchical mixture models based on the nonparametric Bayesian approach. Since various types of data exist, I propose application models that improve or extend existing hierarchical mixture models, taking the characteristics of each data type into account. The ultimate goal is not only to improve performance but also to extend the functionality and practical utility of existing hierarchical mixture modeling. The first study concerns hierarchical mixture modeling for discrete data without label information, and proposes an application model that reflects user domain knowledge to improve on existing models. The second study concerns hierarchical mixture modeling for discrete data with label information, showing that a supervised extension of hierarchical mixture modeling is possible. The third study concerns hierarchical mixture modeling for high-dimensional continuous data, and aims to optimize divisive hierarchical clustering jointly with low-dimensional embedding learning.
The first study develops hierarchical mixture modeling that incorporates user domain knowledge through the Dirichlet forest prior. It applies to discrete data without label (supervision) information, such as documents consisting of plain text. Existing models can be unintuitive for users to interpret because they generate many mixture components with mixed content. If, however, the user specifies the minimum level of detail of the areas of interest, the proposed model infers user-interpretable hierarchical mixture components by reflecting the user's domain knowledge in the form of seed word sets.
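The Dirichlet forest prior itself encodes must-link and cannot-link constraints among words; as a much simpler illustration of how seed words can steer a topic-word prior, the sketch below merely boosts the Dirichlet pseudo-counts of each topic's seed words. The vocabulary, seed sets, and hyperparameter values are hypothetical, and this is deliberately not the actual Dirichlet forest construction:

```python
import numpy as np

# Hypothetical vocabulary and analyst-supplied seed sets:
# each seed set is meant to anchor one mixture component (topic).
vocab = ["game", "team", "score", "bank", "loan", "rate"]
seed_sets = [{"game", "team"}, {"bank", "loan"}]

base_beta = 0.1   # symmetric Dirichlet pseudo-count (illustrative value)
boost = 50.0      # extra pseudo-count for seed words (illustrative value)

def seeded_prior(vocab, seed_sets):
    """Build a topic-word Dirichlet prior in which each topic's seed
    words receive a much larger pseudo-count than non-seed words."""
    prior = np.full((len(seed_sets), len(vocab)), base_beta)
    for k, seeds in enumerate(seed_sets):
        for j, word in enumerate(vocab):
            if word in seeds:
                prior[k, j] += boost
    return prior

prior = seeded_prior(vocab, seed_sets)
# Expected topic-word distributions under the prior alone: each topic
# concentrates its mass on its own seed words before seeing any data.
expected = prior / prior.sum(axis=1, keepdims=True)
```

Under such a prior, posterior inference is pulled toward components whose word distributions respect the analyst's seed sets, which is the intuition behind using seed words to obtain user-interpretable components.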
The second study develops hierarchical mixture modeling for discrete data with label information. Unlike the unsupervised setting of the first study, this study is an instance of supervised learning. Medical prescriptions are an example of such data: the study analyzes them by treating prescribed medicines as document words and symptoms as document labels. It extends the notion of a hierarchical mixture component to include information about not only medicines but also symptoms. Based on the information deduced from the model, the study then detects anomalies using various kinds of prescription metadata.
The third study performs representation learning and hierarchical mixture modeling simultaneously within the framework of the autoencoder, a representative unsupervised deep learning model. It applies to high-dimensional continuous data such as word embeddings or images. To the best of my knowledge, this study is the first in the literature to propose hierarchical mixture density estimation in a neural network embedding space. Hierarchical clustering in the learned low-dimensional space achieves higher accuracy than either performing hierarchical clustering directly in the high-dimensional space or first reducing the dimensionality and then clustering. Moreover, the study demonstrates empirically that extracting this hierarchical latent structure yields greater representation power than conventional autoencoder-based models.
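A minimal sketch of the overall idea, under strong simplifications: a tied-weight linear autoencoder embeds synthetic high-dimensional data into two dimensions, after which one divisive 2-means split is performed in the latent space. Because the optimal encoder of a tied-weight linear autoencoder coincides with the top principal subspace, the closed form via SVD stands in here for gradient training; the data, dimensions, and split routine are all illustrative, and the actual model in the study trains the embedding and the hierarchical clustering jointly with a nonlinear network:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic high-dimensional data: two well-separated clusters (illustrative).
d, n = 50, 100
centers = rng.normal(0, 1, (2, d)) * 4
X = np.vstack([c + rng.normal(0, 0.5, (n, d)) for c in centers])
X = X - X.mean(axis=0)

# A tied-weight *linear* autoencoder's optimal encoder is the top
# principal subspace, so we take that closed form via SVD as a
# stand-in for gradient training (a deliberate simplification).
_, _, Vt = np.linalg.svd(X, full_matrices=False)
W = Vt[:2].T           # encoder: d -> 2
Z = X @ W              # latent codes
X_hat = Z @ W.T        # decoder reuses the same weights (tied)

# One divisive clustering step in the latent space: a 2-means split
# (Lloyd's algorithm with crude endpoint initialization).
def two_means_split(Z, iters=25):
    c = Z[[0, len(Z) - 1]]
    for _ in range(iters):
        a = np.argmin(((Z[:, None] - c[None]) ** 2).sum(-1), axis=1)
        c = np.array([Z[a == k].mean(axis=0) for k in (0, 1)])
    return a

split = two_means_split(Z)
```

Recursing such splits on each resulting subset yields a divisive hierarchy over the latent codes; the study's contribution is to optimize this hierarchical structure and the embedding together rather than in two separate stages as above.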
The studies carried out in this dissertation propose improved or extended hierarchical mixture models adapted to various types of data, with nonparametric Bayesian hierarchical mixture modeling as their common foundation. As acquiring large amounts of data has become easier, summarizing the prominent characteristics of the data has emerged as an important analytical issue. Since the level of abstraction required to summarize data depends on the situation, and intuitive, structured information extraction is very useful for analysts, hierarchical mixture modeling is an important task that warrants continued research.