Identifying features that effectively represent the
energetic contribution of an individual interface
residue to the interactions between proteins
remains problematic. Here, we present several
new features and show that they are more effective
than conventional features. By combining the proposed
features with conventional features, we
develop a predictive model for interaction hot
spots. Initially, 54 multifaceted features, composed
of different levels of information including structure,
sequence and molecular interaction information,
are quantified. Then, to identify the best
subset of features for predicting hot spots, feature
selection is performed using a decision tree. Based
on the selected features, a predictive model for hot
spots is created using a support vector machine
(SVM) and tested on an independent test set. Our
model shows better overall predictive accuracy
than previous methods such as the alanine scanning
methods Robetta and FOLDEF, and the
knowledge-based method KFC. Subsequent analysis
yields several findings about hot spots. As
expected, hot spots have a larger relative surface
area burial and are more hydrophobic than other
residues. Unexpectedly, however, residue conservation
displays a rather complicated tendency
depending on the types of protein complexes, indicating
that this feature is not reliable for identifying
hot spots. Of the selected features, the weighted
atomic packing density, relative surface area burial
and weighted hydrophobicity are the top three, with
the weighted atomic packing density proving to
be the most effective feature for predicting
hot spots. Notably, we find that hot spots are
closely related to π-related interactions, especially
π···π interactions.
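
The two-stage workflow described above (decision-tree feature selection followed by an SVM evaluated on held-out data) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes scikit-learn, uses synthetic random data in place of the 54 real features, substitutes a mean-importance threshold for whatever selection criterion was actually applied, and picks the RBF kernel and all hyperparameters arbitrarily.

```python
import numpy as np
from sklearn.feature_selection import SelectFromModel
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption): 300 interface residues x 54 features.
rng = np.random.default_rng(0)
n_residues, n_features = 300, 54
X = rng.normal(size=(n_residues, n_features))

# Hypothetical labels: 1 = hot spot, 0 = non-hot spot, driven by three
# arbitrary columns plus noise so that some features are informative.
signal = X[:, 0] + 0.8 * X[:, 1] - 0.6 * X[:, 2]
y = (signal + rng.normal(scale=0.5, size=n_residues) > 0).astype(int)

# Hold out an independent test set that is never seen during fitting.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Stage 1: a decision tree scores the features; keep those whose
# importance is at least the mean importance (an assumed criterion).
selector = SelectFromModel(
    DecisionTreeClassifier(max_depth=5, random_state=0), threshold="mean"
)

# Stage 2: an SVM is trained on the selected, standardized features.
model = make_pipeline(selector, StandardScaler(), SVC(kernel="rbf", C=1.0))
model.fit(X_train, y_train)

selected = model.named_steps["selectfrommodel"].get_support(indices=True)
accuracy = model.score(X_test, y_test)

print("selected feature columns:", selected.tolist())
print(f"held-out accuracy: {accuracy:.2f}")
```

Wrapping both stages in one pipeline keeps the feature selection inside the training step, so the held-out residues influence neither the chosen features nor the fitted classifier.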