In this paper, we propose a packet-level parallel data transfer scheme and a Two-Phase Scheduling (TPS) algorithm for collective communication primitives in MPICH-G2. The algorithms are characterized by two features: 1) concurrent transfer of packets from a source node to multiple destination nodes, and 2) scheduling that improves the performance of collective communications by identifying bottleneck-incurring nodes early. The proposed techniques are implemented and the performance improvement is measured. According to the performance evaluation, the proposed method achieves about a 20% performance improvement over conventional block data transfer methods when a binomial tree is used for communication in a LAN. In the TPS algorithm, the distribution of messages to bottleneck-incurring nodes is delayed to minimize the effect of those nodes on overall performance. Using the TPS algorithm on a WAN, significant performance improvement has also been achieved for various data sizes and numbers of nodes.
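
The idea of delaying message distribution to bottleneck-incurring nodes can be illustrated with a minimal sketch. It is not the TPS algorithm itself, whose two phases are not detailed here. It only shows one plausible way to combine a binomial broadcast tree with a node ordering that places suspected bottleneck nodes last, so that they receive data in the final rounds and forward as little as possible. The function names (`binomial_rounds`, `delay_bottlenecks`, `schedule`) and the assumption that the first node is the root and is not itself a bottleneck are hypothetical and introduced for illustration only.

```python
def binomial_rounds(n):
    """Return the rounds of a binomial-tree broadcast over positions 0..n-1.

    Position 0 is the root. In the round with stride `step`, every
    position p < step (which already holds the data) sends to p + step,
    provided that position exists.
    """
    rounds = []
    step = 1
    while step < n:
        rounds.append([(p, p + step) for p in range(step) if p + step < n])
        step *= 2
    return rounds


def delay_bottlenecks(nodes, bottlenecks):
    """Reorder nodes so suspected bottleneck nodes come last.

    Assumption: nodes[0] is the root and is not in `bottlenecks`.
    Relative order within each group is preserved.
    """
    fast = [x for x in nodes if x not in bottlenecks]
    slow = [x for x in nodes if x in bottlenecks]
    return fast + slow


def schedule(nodes, bottlenecks):
    """Map binomial-tree positions onto the reordered node list."""
    order = delay_bottlenecks(nodes, bottlenecks)
    return [
        [(order[src], order[dst]) for src, dst in rnd]
        for rnd in binomial_rounds(len(order))
    ]


if __name__ == "__main__":
    # Hypothetical four-node example in which node "b" is a suspected bottleneck.
    for i, rnd in enumerate(schedule(["a", "b", "c", "d"], {"b"})):
        print(f"round {i}: {rnd}")
```

In this sketch, positions reached only in the last round are leaves of the binomial tree, so a slow node placed there cannot hold up any downstream receiver.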