In future computing systems (or infrastructures), which are based not only on regular direct networks such as meshes or rings but also on switch-based irregular networks and mobile ad hoc networks, $\emph{collective communication}$ is a crucial issue because collective communication among multiple nodes in a cooperating group usually constitutes the sequential or bottleneck part of parallel and mobile computing applications. In this research, we investigate two of the most important collective communication primitives, $\emph{barrier synchronization}$ and $\emph{multicast}$, which are easily applied to other collective communication functions such as total exchange or reduction.
First, we propose a $\emph{Barrier Tree for Meshes (BTM)}$ to minimize the barrier synchronization latency for two-dimensional (2D) mesh systems. The proposed BTM scheme has two distinguishing features. One is that the synchronization tree is 4-ary. The synchronization latency of the BTM scheme is asymptotically $\Theta(\log_4 n)$, while that of the fastest scheme reported in the literature is bounded between $\Omega(\log_3 n)$ and $O(n^{1/2})$, where $n$ is the number of member nodes. The other is that nonmember nodes neither are involved in the construction of a BTM nor actively participate in the synchronization operations, which avoids interference among different process groups during synchronization. An extensive simulation study shows that, for meshes up to $64\times64$, the BTM scheme results in about 40~70% shorter synchronization latency and is more scalable than conventional schemes.
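To give a sense of scale for these bounds, the following worked arithmetic evaluates the three growth terms at the largest simulated mesh size, under the illustrative assumption that all $64\times64$ nodes are members (so $n = 4096$). Constant factors are omitted, so these are growth terms only, not latency predictions or simulation results.

$$
n = 64 \times 64 = 4096 = 4^{6}, \qquad \log_4 n = 6, \qquad \log_3 n = \frac{\ln 4096}{\ln 3} \approx 7.57, \qquad n^{1/2} = 64.
$$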
Second, we propose a $\emph{Barrier Tree for Irregular Networks (BTIN)}$ and a barrier synchronization scheme using BTIN for switch-based cluster systems. The synchronization latency of the proposed BTIN scheme is asymptotically $O(\log n)$ while that of the fastest scheme reported in the literature is bounded by $O(n)$, where $n$ is the number of member nodes. According to our simulation results, for th...