Multi-exit architectures are a promising solution for anytime prediction: they can adapt their predictions via early exits to the current test-time budget, which may vary over time in practice (e.g., self-driving cars with dynamically changing speeds). In contrast to previous works, where each block is optimized to minimize the losses of all exits simultaneously, we propose a new method for training multi-exit architectures that imposes different objectives on individual blocks. Our key idea is to design block-dependent losses based on grouping and overlapping strategies, which enable each k-th block to focus more on reducing the loss of the adjacent k-th exit while not degrading the prediction performance at later exits. This improves the prediction performance at the earlier exits, making our scheme more suitable for low-latency applications with a tight test-time budget. Experimental results on both image classification and semantic segmentation confirm the advantage of our approach for anytime prediction.
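To make the grouping-and-overlapping idea concrete, the following is a minimal, hypothetical sketch of how block-dependent loss weights might be structured; the function name `block_loss_weights`, the `overlap` parameter, and the specific weight values are illustrative assumptions, not the paper's actual scheme.

```python
# Hypothetical illustration of block-dependent loss weights (NOT the paper's
# exact formulation). With K blocks and K exits, conventional training assigns
# every block the sum of all exit losses; here, block k emphasizes its adjacent
# exit k and, via an "overlap" window, the next few exits, while later exits
# still receive a reduced weight so their performance is not degraded.

def block_loss_weights(num_exits, overlap=1):
    """Return a K x K matrix W where W[k][j] is the weight that block k's
    objective assigns to the loss of exit j. Only j >= k is nonzero, since
    exit j depends only on blocks 0..j. Exits in the overlapping group
    k..k+overlap get full weight; later exits get a smaller weight."""
    W = [[0.0] * num_exits for _ in range(num_exits)]
    for k in range(num_exits):
        for j in range(k, num_exits):
            # full weight inside the overlapping group, reduced weight beyond it
            W[k][j] = 1.0 if j <= k + overlap else 0.5
    return W

W = block_loss_weights(4, overlap=1)
# Block 0 fully weights exits 0 and 1, and down-weights exits 2 and 3;
# the last block is trained only on its own (final) exit loss.
```

Training would then update block k's parameters using the weighted sum of exit losses given by row W[k], rather than an unweighted sum over all exits.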