Simulating protein folding thermodynamics starting purely from a protein sequence is a grand challenge of computational biology. Here, we present an algorithm to calculate a canonical distribution from molecular dynamics simulation of protein folding. This algorithm is based on the replica exchange method where the kinetic trapping problem is overcome by exchanging noninteracting replicas simulated at different temperatures. Our algorithm uses multiplexed-replicas with a number of independent molecular dynamics runs at each temperature. Exchanges of configurations between these multiplexed-replicas are also tried, rendering the algorithm applicable to large-scale distributed computing (i.e., highly heterogeneous parallel computers with processors having different computational power). We demonstrate the enhanced sampling of this algorithm by simulating the folding thermodynamics of a 23 amino acid miniprotein. We show that better convergence is achieved compared to constant temperature molecular dynamics simulation, with an efficient scaling to large number of computer processors. Indeed, this enhanced sampling results in (to our knowledge) the first example of a replica exchange algorithm that samples a folded structure starting from a completely unfolded state.