OptimStore: In-Storage Optimization of Large Scale DNNs with On-Die Processing

Training deep neural network (DNN) models is a resource-intensive, iterative process. For this reason, complex optimizers like Adam are now widely adopted, as they increase the speed and efficiency of training. These optimizers, however, employ additional variables and raise the memory demand to between 2× and 3× that of the model parameters, worsening the memory capacity bottleneck. Moreover, as the size of DNN models is projected to grow even further, it is not practical to assume that future models will fit in accelerator memory. This has triggered various efforts to offload models to flash-based storage. However, when the model, especially the optimizer, is offloaded to flash, the limited I/O bandwidth severely slows down the overall training process. To this end, we present OptimStore, a solid-state drive (SSD) system with on-die processing (ODP) architectures for gradient descent-based machine learning models. OptimStore accelerates the training process of such large-scale models by processing model optimization in the storage device, specifically inside the flash dies. The ODP capability of OptimStore eliminates the heavy data movement over the external interconnect and internal flash channels. Overall, OptimStore achieves, on average, a 2.8× speedup and a 3.6× improvement in energy efficiency in the weight update stage over baseline SSD offloading.
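To make the memory argument in the abstract concrete, the following is a minimal sketch of a textbook Adam weight update together with a rough count of the state it has to keep. It is not OptimStore's implementation: the function names `adam_step` and `state_bytes`, the 4-byte value size, and the hyperparameter defaults are assumptions chosen for illustration only.

```python
import math


def adam_step(w, g, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one textbook Adam update in place.

    w, g, m, v are flat lists of floats of equal length (weights, gradients,
    first-moment and second-moment estimates); t is the 1-based step count.
    """
    for i in range(len(w)):
        # Update the two per-parameter optimizer states.
        m[i] = beta1 * m[i] + (1 - beta1) * g[i]
        v[i] = beta2 * v[i] + (1 - beta2) * g[i] * g[i]
        # Bias-correct the moment estimates.
        m_hat = m[i] / (1 - beta1 ** t)
        v_hat = v[i] / (1 - beta2 ** t)
        # Update the weight itself.
        w[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)


def state_bytes(num_params, bytes_per_value=4, extra_states=2):
    """Bytes for the weights plus the optimizer states.

    Assumes every value has the same width; Adam keeps two extra values
    (m and v) per parameter, so extra_states defaults to 2.
    """
    return num_params * bytes_per_value * (1 + extra_states)


if __name__ == "__main__":
    w, g, m, v = [1.0], [0.5], [0.0], [0.0]
    adam_step(w, g, m, v, t=1)
    print(w[0])
    print(state_bytes(1_000_000) / (1_000_000 * 4))
```

Every step reads and writes `w`, `m`, and `v` for each parameter. Under these assumptions the resident state is three times the size of the weights alone, and all of it crosses the storage interface on every update when the optimizer is offloaded to an SSD, which is the traffic the abstract describes.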
Publisher
IEEE Computer Society
Issue Date
2023-02-28
Language
English
Citation

29th IEEE International Symposium on High-Performance Computer Architecture, HPCA 2023, pp.611 - 623

ISSN
1530-0897
DOI
10.1109/HPCA56546.2023.10071024
URI
http://hdl.handle.net/10203/314933
Appears in Collection
EE-Conference Papers (Conference Papers)