Two-stage precoding is widely considered in frequency-division duplex (FDD) massive multiple-input multiple-output (MIMO) systems as a way to reduce channel feedback overhead. In such systems, users can be partitioned into groups with similar spatial antenna correlations. In the first stage, the outer precoder reduces the channel dimension while mitigating inter-group interference; in the second stage, the inner precoder suppresses intra-group interference over the resulting lower-dimensional effective channel. The dimension of the effective channel after outer precoding is therefore a key design parameter, because it governs the trade-off among inter-group interference, intra-group interference, and the performance loss caused by quantized channel feedback. In this paper, we propose a machine learning framework for finding the effective channel dimensions that maximize the average sum rate, a problem that is NP-hard. The framework uses a deep neural network whose inputs are channel statistics and whose outputs are the effective channel dimensions after outer precoding. Numerical results show that the proposed machine learning-based dimension optimization achieves an average sum rate comparable to the optimum found by brute-force search, which is infeasible in practice.
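
To illustrate why a brute-force search over the effective channel dimensions scales poorly, the following minimal sketch enumerates every per-group dimension assignment and keeps the best one. The function names, the `toy_score` objective, and the parameter values are hypothetical and are not taken from the paper; in the actual problem the objective would be the average sum rate, which is far more expensive to evaluate.

```python
from itertools import product


def brute_force_dimensions(num_groups, max_dim, score):
    """Evaluate every per-group dimension tuple and return the best one.

    num_groups: number of user groups (assumed parameter).
    max_dim:    largest effective dimension allowed per group (assumed).
    score:      callable mapping a tuple of dimensions to a real number.
    """
    best_dims, best_score, evaluated = None, float("-inf"), 0
    # Each group independently picks a dimension in 1..max_dim.
    for dims in product(range(1, max_dim + 1), repeat=num_groups):
        evaluated += 1
        value = score(dims)
        if value > best_score:
            best_dims, best_score = dims, value
    return best_dims, best_score, evaluated


def toy_score(dims):
    # Hypothetical stand-in for the average sum rate: each group's term
    # rises and then falls with its dimension, mimicking a trade-off.
    return sum(d * (6 - d) for d in dims)


if __name__ == "__main__":
    dims, value, count = brute_force_dimensions(3, 5, toy_score)
    print(dims, value, count)
```

In this toy setting the number of candidates is `max_dim ** num_groups`, so the search grows exponentially with the number of groups, which is the kind of cost a learned predictor is meant to avoid.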