ES-MoE: overcoming the scalability challenges in mixture-of-experts models

DC Field / Value / Language
dc.contributor.advisor: 한동수 (Han, Dongsu)
dc.contributor.author: Kim, Yechan
dc.contributor.author: 김예찬
dc.date.accessioned: 2024-07-25T19:30:44Z
dc.date.available: 2024-07-25T19:30:44Z
dc.date.issued: 2023
dc.identifier.uri: http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1045718&flag=dissertation (en_US)
dc.identifier.uri: http://hdl.handle.net/10203/320530
dc.description: Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST) : Kim Jaechul Graduate School of AI, 2023.8, [iii, 26 p.]
dc.description.abstract: Mixture-of-Experts (MoE) models have recently emerged as a powerful technique for enhancing the scalability and performance of neural networks, primarily by leveraging learnable gating networks to allocate input tokens to different expert models. However, training MoE models on GPUs presents unique challenges, including insufficient GPU memory capacity for a large number of experts and computational inefficiency due to token load imbalance. To address these issues, we introduce Expert Server MoE (ES-MoE), a novel method that offloads all expert parameters and their optimizer states to CPUs. This approach not only mitigates the memory constraints of GPU-based training but also enhances training throughput by creating a unified pool of experts that allows for more efficient scheduling. Furthermore, ES-MoE employs pipelined expert optimization to minimize iteration latency, effectively circumventing the issue of extended CPU optimization time. We validate our approach using GPT-based MoE architectures, demonstrating that ES-MoE scales up to 16 times better than existing frameworks and improves throughput by up to 4.55x.
dc.language: eng
dc.publisher: 한국과학기술원 (KAIST)
dc.subject: 전문가 혼합 모델 시스템; 머신 러닝 시스템; 메모리 개선; 학습속도 가속화; 파이프라이닝
dc.subject: Mixture-of-experts system; Machine learning system; Memory improvements; Training acceleration; Pipelining
dc.title: ES-MoE: overcoming the scalability challenges in mixture-of-experts models
dc.title.alternative: 전문가 혼합 모델의 확장성 문제 극복 연구 (A study on overcoming the scalability challenges of mixture-of-experts models)
dc.type: Thesis (Master's)
dc.identifier.CNRN: 325007
dc.description.department: KAIST : Kim Jaechul Graduate School of AI (한국과학기술원 : 김재철AI대학원)
dc.contributor.alternativeauthor: Han, Dongsu
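
The abstract above describes two system-level ideas: keeping expert parameters and their optimizer states on the CPU, and pipelining the CPU-side optimizer step so it overlaps with GPU computation. The following is a minimal, hypothetical PyTorch sketch of those two ideas only; it is not code from the thesis, and names such as CPUOffloadedExpert, forward_backward, and cpu_step are illustrative assumptions rather than the ES-MoE API.

# Minimal sketch, assuming one CPU-resident master copy per expert and a
# thread pool that runs each expert's Adam step while the GPU moves on.
import copy
from concurrent.futures import ThreadPoolExecutor

import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"

class CPUOffloadedExpert:
    """One expert whose master weights and optimizer states live on the CPU."""
    def __init__(self, d_model: int, d_ff: int):
        # Master copy of the expert MLP stays on the CPU.
        self.cpu_expert = nn.Sequential(
            nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model)
        )
        # Adam states are allocated alongside the CPU parameters.
        self.optimizer = torch.optim.Adam(self.cpu_expert.parameters(), lr=1e-3)

    def forward_backward(self, x: torch.Tensor) -> torch.Tensor:
        """Run this expert on the accelerator for the tokens routed to it."""
        # Temporary accelerator replica of the expert for this iteration only.
        gpu_expert = copy.deepcopy(self.cpu_expert).to(x.device)
        out = gpu_expert(x)
        loss = out.float().pow(2).mean()  # stand-in for the real loss term
        loss.backward()
        # Copy gradients back onto the CPU master parameters.
        for p_cpu, p_gpu in zip(self.cpu_expert.parameters(),
                                gpu_expert.parameters()):
            p_cpu.grad = p_gpu.grad.detach().to("cpu")
        return out.detach()

    def cpu_step(self):
        """Optimizer step on the CPU (run in a background thread)."""
        self.optimizer.step()
        self.optimizer.zero_grad(set_to_none=True)

# Pipelined expert optimization (illustrative): as soon as an expert's
# gradients reach the CPU, its optimizer step is submitted to a thread pool
# so it overlaps with the GPU work of the remaining experts.
experts = [CPUOffloadedExpert(d_model=64, d_ff=256) for _ in range(4)]
tokens_per_expert = [torch.randn(8, 64, device=device) for _ in experts]

with ThreadPoolExecutor(max_workers=2) as pool:
    futures = []
    for expert, tokens in zip(experts, tokens_per_expert):
        expert.forward_backward(tokens)               # GPU compute
        futures.append(pool.submit(expert.cpu_step))  # CPU update overlaps
    for f in futures:
        f.result()

In the actual system, token routing, precision handling, and scheduling across many experts would be considerably more involved; the sketch only marks where the CPU offload and the compute/optimizer overlap would sit.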
Appears in Collection
AI-Theses_Master(석사논문)
Files in This Item
There are no files associated with this item.
