As the number of unlabeled documents grows, unsupervised methods that can extract information automatically are becoming increasingly important. Topic models and neural networks are two such methods. Because their parameters cannot be computed exactly, approximation algorithms are typically used to estimate them. A well-known weakness of these approximation algorithms is that they do not find the global optimum but instead converge to one of many local optima. It is also known that the initialization of the parameters affects the results of the approximation. In this paper, we hypothesize that the order in which data classes are presented is another factor that affects the parameter approximation results. Through digit recognition experiments on MNIST data, we show that this hypothesis is valid and argue that it is always better to use fully shuffled data to avoid incorrect conclusions.