These days Bayesian network models are popular in the AI, medicine, biology, education, and statistics communities. Their expression of independence is complicated because it takes the directionality of the arcs into account. By contrast, graphical models are graphs in which nodes represent random variables and missing edges represent conditional independence assumptions. Graphical log-linear models have a simple definition of independence: two (sets of) nodes, A and B, are conditionally independent given a third set, C, if every path between a node in A and a node in B passes through a node in C.
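The separation criterion above reduces to a simple reachability check: delete the nodes of C from the graph and ask whether any path from A to B survives. A minimal sketch (the graph, variable names, and adjacency-dict representation are illustrative assumptions, not from the paper):

```python
from collections import deque

def separated(adj, A, B, C):
    """Return True if every path between A and B passes through C
    in the undirected graph given by the adjacency dict `adj`."""
    blocked = set(C)                      # deleting C blocks these nodes
    seen = {a for a in A if a not in blocked}
    queue = deque(seen)
    while queue:
        v = queue.popleft()
        if v in B:
            return False                  # found an A-B path avoiding C
        for w in adj.get(v, ()):
            if w not in blocked and w not in seen:
                seen.add(w)
                queue.append(w)
    return True

# Toy chain X - Y - Z: X and Z are separated by {Y} but not by the empty set.
adj = {"X": ["Y"], "Y": ["X", "Z"], "Z": ["Y"]}
print(separated(adj, {"X"}, {"Z"}, {"Y"}))  # True
print(separated(adj, {"X"}, {"Z"}, set()))  # False
```

In a graphical log-linear model, a `True` result corresponds to the conditional independence of the variables in A and B given those in C.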
We want to transform a Bayesian network model for a large set of random variables, assumed to be causally related, into a marginal log-linear model. Modelling the whole data set at once, when it contains a large number of variables, is time-consuming and tends to yield models far from the true model. When building a model for a large set of random variables, it is therefore desirable to divide the whole set into several subsets of manageable size.
It is important to arrange the variables into subgroups so that they are more strongly associated within subgroups than between subgroups. The classification and regression tree (CART) algorithm is useful for this grouping.
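The paper uses CART for the grouping; as a simplified stand-in, the sketch below groups variables greedily by pairwise correlation (the data, threshold, and group-size cap are all illustrative assumptions):

```python
import math

def corr(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sx = math.sqrt(sum((v - mx) ** 2 for v in x))
    sy = math.sqrt(sum((v - my) ** 2 for v in y))
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (sx * sy)

def greedy_groups(data, names, threshold=0.5, max_size=3):
    """Assign each variable to the first group that contains a variable
    with which its absolute correlation exceeds `threshold`."""
    groups = []
    for i, name in enumerate(names):
        for g in groups:
            if len(g) < max_size and any(
                abs(corr(data[i], data[names.index(m)])) > threshold for m in g
            ):
                g.append(name)
                break
        else:
            groups.append([name])     # start a new group
    return groups

# Toy data: X1 and X2 are perfectly correlated, X3 is uncorrelated with both.
names = ["X1", "X2", "X3"]
data = [[1, 2, 3, 4], [2, 4, 6, 8], [3, 1, 4, 2]]
print(greedy_groups(data, names))  # [['X1', 'X2'], ['X3']]
```

Any grouping method that keeps within-group association high and between-group association low serves the same purpose here.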
Once the subgroups of random variables are chosen, we apply log-linear modelling to each group individually and obtain graphical log-linear models whose structures are representable as graphs of vertices and edges.
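As a minimal illustration of log-linear modelling within one group, the sketch below fits the simplest graphical log-linear model, independence of two binary variables, to a hypothetical 2x2 contingency table. Under the independence model [A][B] the maximum-likelihood fitted counts have the closed form m_ij = (row total)(column total)/n, and the deviance G^2 measures the fit (the table counts are invented for illustration):

```python
import math

# Hypothetical 2x2 contingency table for two binary variables A and B.
table = [[30, 10],
         [20, 40]]

n = sum(sum(row) for row in table)
row_tot = [sum(row) for row in table]
col_tot = [sum(table[i][j] for i in range(2)) for j in range(2)]

# MLE fitted counts under the log-linear independence model [A][B].
fitted = [[row_tot[i] * col_tot[j] / n for j in range(2)] for i in range(2)]

# Deviance G^2 = 2 * sum n_ij * log(n_ij / m_ij), compared against a
# chi-square distribution with 1 degree of freedom.
G2 = 2 * sum(table[i][j] * math.log(table[i][j] / fitted[i][j])
             for i in range(2) for j in range(2))
print(round(G2, 2))  # 17.26 -- far above the 1-df critical value, so
                     # independence is rejected for this toy table
```

Larger groups work the same way, with separation in the fitted model's graph encoding the conditional independences.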
We then find particular graph separators, called prime separators, which are defined as separators that separate cliques or irreducible cycles. Prime separators have the useful property that they remain prime separators in both a graphical model and its marginal model. This property is used in combining the marginal models of a graphical log-linear model.
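As a simplified stand-in for the prime-separator machinery, the sketch below checks the closely related classical notion of a clique separator: a complete vertex set whose removal disconnects the graph (the example graph is an illustrative assumption):

```python
from itertools import combinations

def components_without(adj, removed):
    """Connected components of the graph after deleting `removed`."""
    left = set(adj) - set(removed)
    comps = []
    while left:
        start = left.pop()
        comp, stack = {start}, [start]
        while stack:
            v = stack.pop()
            for w in adj[v]:
                if w in left:
                    left.discard(w)
                    comp.add(w)
                    stack.append(w)
        comps.append(comp)
    return comps

def is_clique_separator(adj, C):
    """True if C induces a complete subgraph and its removal disconnects the graph."""
    complete = all(v in adj[u] for u, v in combinations(C, 2))
    return complete and len(components_without(adj, C)) > 1

# Two triangles a-b-c and b-c-d sharing the edge {b, c}:
# {b, c} is a clique separator; {b} alone separates nothing.
adj = {"a": {"b", "c"}, "b": {"a", "c", "d"},
       "c": {"a", "b", "d"}, "d": {"b", "c"}}
print(is_clique_separator(adj, {"b", "c"}))  # True
print(is_clique_separator(adj, {"b"}))       # False
```

The persistence property stated above means that such separators found in the marginal models can be reused as joins when the marginal models are combined.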
Finally, we compare the combined model with ...