LazyBatching: An SLA-aware Batching System for Cloud Machine Learning Inference. Choi, Yujeong; Kim, Yunseong; Rhu, Minsoo, The 27th IEEE International Symposium on High-Performance Computer Architecture (HPCA-27), pp. 493-506, IEEE Computer Society, 2021-03-02
NeuMMU: Architectural Support for Efficient Address Translations in Neural Processing Units. Hyun, Bongjoon; Kwon, Youngeun; Choi, Yujeong; Kim, John Dongjun; Rhu, Minsoo, The 25th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS-25), pp. 1109-1124, ACM, 2020-03-20
PARIS and ELSA: An Elastic Scheduling Algorithm for Reconfigurable Multi-GPU Inference Servers. Kim, Yunseong; Choi, Yujeong; Rhu, Minsoo, 59th ACM/IEEE Design Automation Conference, DAC 2022, pp. 607-612, ACM/IEEE/ESDA, 2022-06-10
PREMA: A Predictive Multi-task Scheduling Algorithm For Preemptible Neural Processing Units. Choi, Yujeong; Rhu, Minsoo, 26th IEEE International Symposium on High Performance Computer Architecture, HPCA 2020, pp. 220-233, IEEE, 2020-02-24