Automatic thesaurus construction extracts term relations mechanically. A popular method uses statistical analysis to discover these relations. For low-frequency terms, however, the statistical information about the terms cannot be used reliably to decide how the terms are related. This is generally referred to as the data-sparseness problem. Unfortunately, many studies have shown that low-frequency terms are the most useful in thesaurus construction. This paper characterizes the statistical behavior of terms by using an inference network. A formal approach to the data-sparseness problem, which is crucial in constructing a thesaurus, is developed. The validity of this approach is shown by experiments. Copyright (C) 1996 Elsevier Science Ltd