Emotional events are thought to be more strongly encoded and retrieved than neutral events. Prior studies on emotional processing have reported that emotion enhances the subjective vividness and accuracy of memories. However, the effect of emotion on memory is not uniformly positive: abundant evidence also indicates that emotional memories are subject to distortion despite their subjective vividness. To reconcile these conflicting results, we propose that the low-level sensory information close to the bottom-up input and the high-level semantic information of an emotional event are differentially regulated. To test this hypothesis, we performed an event-related functional magnetic resonance imaging (fMRI) experiment. During the scan, cortical activity was monitored while participants perceived or recalled emotional and neutral scenes. The participants then performed two post-scan tests: the sentence test, which highlights high-level semantic information, and the image test, which emphasizes low-level visual information. Consistent with the hypothesis, participants performed better for emotional scenes than for neutral scenes in the sentence test, while the opposite tendency was found in the image test. The poorer performance for low-level visual information further suggests that representations in the visual cortex may be modulated by top-down signals. In the fMRI data, we found that the representational similarity (RS) between the primary visual cortex (V1) and the dorsolateral prefrontal cortex (dlPFC) was greater during the encoding of emotional scenes than during the encoding of neutral ones. Furthermore, the RS between V1 and dlPFC reflected emotional memory strength in the sentence test, but not in the image test.
These results suggest that, during the encoding of emotional scenes compared to neutral scenes, representations in the visual cortex are more strongly modulated by top-down signals, leading to better high-level semantic memory but weaker low-level sensory memory.
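
The abstract does not state how the RS between V1 and dlPFC was computed. As an illustration only, one common way to compare representations across two regions is a second-order similarity: build a scene-by-scene dissimilarity matrix from each region's activity patterns, then correlate the two matrices. The sketch below assumes that approach; the function names, the use of Pearson correlation, and the toy activity patterns are all hypothetical and are not taken from the study.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def rdm_upper(patterns):
    """Upper triangle of a dissimilarity matrix (1 - r) over all scene pairs.

    `patterns` holds one activity pattern (a list of voxel values) per scene.
    """
    return [
        1.0 - pearson(patterns[i], patterns[j])
        for i, j in combinations(range(len(patterns)), 2)
    ]


def representational_similarity(roi_a, roi_b):
    """Second-order similarity: correlate the two regions' dissimilarity structures.

    The regions may have different voxel counts but must cover the same scenes
    in the same order.
    """
    return pearson(rdm_upper(roi_a), rdm_upper(roi_b))


# Toy data (hypothetical): 4 scenes, 4 "voxels" in V1 and 3 in dlPFC.
v1 = [
    [1.0, 2.0, 3.0, 4.0],
    [1.1, 2.1, 2.9, 4.2],
    [4.0, 3.0, 2.0, 1.0],
    [2.0, 1.0, 4.0, 3.0],
]
dlpfc = [
    [0.5, 1.5, 0.2],
    [0.6, 1.4, 0.3],
    [1.5, 0.4, 1.2],
    [0.9, 0.8, 1.6],
]

rs_v1_dlpfc = representational_similarity(v1, dlpfc)
```

Under this assumed formulation, a higher value means the two regions group the same scenes as similar or dissimilar in the same way. Comparing the value computed over emotional scenes with the value computed over neutral scenes would correspond to the contrast described in the abstract.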