Spontaneous reactions are assumed to play a vital role in realistic human-agent and agent-agent interaction. To achieve such spontaneity, the importance of two abilities was investigated: predicting the partner's action and controlling reaction speed. The proposed data-driven approach uses action-reaction pairs, each consisting of temporal skeleton data of two persons captured with a depth camera. Reactions that are synchronized with, or faster than, the corresponding actions were generated by training artificial neural networks on these data. One part of the network predicts the action pose at a given time step, and the other produces an interaction representation for that action pose, defined as the difference from the action pose to the reaction pose. The results showed that synchronized and faster reactions, based on a few steps of valid action prediction, could give a virtual agent a certain degree of spontaneity.
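
The relation between action pose, interaction representation, and reaction pose described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, poses are assumed to be flattened lists of joint coordinates, and a constant-velocity extrapolation stands in for the learned action predictor, which in the described approach is a neural network.

```python
from typing import List

Pose = List[float]  # assumption: one skeleton frame as flattened joint coordinates


def interaction_representation(action: Pose, reaction: Pose) -> Pose:
    """Difference from the action pose to the reaction pose."""
    return [r - a for a, r in zip(action, reaction)]


def reconstruct_reaction(action: Pose, interaction: Pose) -> Pose:
    """Recover a reaction pose by adding the interaction representation to the action pose."""
    return [a + d for a, d in zip(action, interaction)]


def extrapolate_action(history: List[Pose], steps_ahead: int) -> Pose:
    """Hypothetical stand-in for the learned predictor: constant-velocity extrapolation."""
    if len(history) < 2:
        return list(history[-1])
    prev, last = history[-2], history[-1]
    return [l + steps_ahead * (l - p) for p, l in zip(prev, last)]


# Toy two-coordinate poses, chosen only for illustration.
action_now = [1.0, 2.0]
reaction_now = [1.5, 1.0]
interaction = interaction_representation(action_now, reaction_now)
reaction_rebuilt = reconstruct_reaction(action_now, interaction)

# Predicting the action a few steps ahead is what would allow a reaction
# that is synchronized with, or faster than, the observed action.
action_history = [[0.0, 1.0], [0.5, 1.0]]
action_predicted = extrapolate_action(action_history, steps_ahead=2)
```

In this sketch a faster reaction would be obtained by feeding the predicted action pose, rather than the currently observed one, into whatever model produces the interaction representation; the number of steps ahead then acts as the control on reaction speed.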