An empirical study on DNN inference: Adaptive degree assignation schema based on pruning sensitivity model

The need to reduce the computational resources consumed by deep learning applications has motivated researchers to introduce several techniques that modify pre-trained DNNs. Pruning is one such technique, and it is particularly suitable for modifying models that will be executed on GPUs and common DL frameworks. Since the introduction of pruning, a great number of pruning approaches have been developed, each better than the previous at reducing inference computation cost while losing a smaller percentage of the original model's accuracy. However, not much has been said about the degree to which the layers of a DNN should be pruned, a particularly tricky task since it has been shown that different layers exhibit distinct sensitivities to pruning. The aim of this work is to study, experiment with, and improve the means by which a proper pruning degree can be determined. To this end, we introduce a mathematical model of pruning sensitivity, as well as a schema for generating pruning degree assignments specific to each of the model's layers based on their pruning sensitivity characteristics. We then explore the usefulness of our proposed schema on a variety of models and datasets, thus providing a holistic view of the potential benefits of pruning. We performed a side-by-side comparison between models pruned according to our schema and models pruned according to literature-based pruning degree assignments. In terms of performance and accuracy, our approach allowed us to prune models with less than 1% accuracy drop compared with those pruned according to the literature, while achieving 17% to 22% more compute cost reduction. Additionally, we pruned a variety of models and achieved more than 30% compute cost reduction while losing no more than 2.3% accuracy in the majority of cases. These benefits were obtained thanks to the mathematical basis used for modeling the pruning sensitivity of DNN layers. This model allowed us to remove considerably more weights from robust layers while preserving more weights in the sensitive ones, thus achieving a good compromise between inference performance and accuracy.
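The abstract describes assigning a per-layer pruning degree from each layer's measured sensitivity. As a rough illustration only, not the thesis's actual mathematical model, the sketch below estimates each layer's solo sensitivity by L1-pruning it at several candidate degrees and recording the accuracy drop, then assigns each layer the largest degree whose drop stays within a budget. The names `model`, `evaluate`, the degree grid, and the `max_drop` budget are all hypothetical placeholders.

```python
# Minimal sketch of per-layer pruning sensitivity analysis (PyTorch).
# Assumptions: `model` is an nn.Module, `evaluate(model)` returns accuracy
# on a validation set; both are supplied by the caller.
import copy

import torch.nn as nn
import torch.nn.utils.prune as prune


def layer_sensitivity(model, evaluate, degrees=(0.1, 0.3, 0.5, 0.7, 0.9)):
    """Measure accuracy drop per layer when it alone is pruned at each degree."""
    baseline = evaluate(model)
    curves = {}
    for name, module in model.named_modules():
        if not isinstance(module, (nn.Conv2d, nn.Linear)):
            continue
        curves[name] = {}
        for d in degrees:
            # Prune a copy so each trial starts from the original weights.
            trial = copy.deepcopy(model)
            target = dict(trial.named_modules())[name]
            prune.l1_unstructured(target, name="weight", amount=d)
            curves[name][d] = baseline - evaluate(trial)
    return curves


def assign_degrees(curves, max_drop=0.01):
    """Pick, per layer, the largest degree whose solo accuracy drop fits
    the budget: robust layers get high degrees, sensitive layers keep
    most of their weights."""
    return {
        name: max((d for d, drop in curve.items() if drop <= max_drop),
                  default=0.0)
        for name, curve in curves.items()
    }
```

The budget rule here mirrors the trade-off the abstract describes, pruning robust layers aggressively while preserving sensitive ones, but the actual schema in the thesis is derived from its pruning sensitivity model rather than this simple thresholding.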
Advisors
Youn, Chan-Hyun (윤찬현)
Description
KAIST, School of Electrical Engineering
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2018
Identifier
325007
Language
eng
Description

Master's thesis - KAIST: School of Electrical Engineering, 2018.2, [iii, 48 p.]

Keywords

Inference; Inference Serving; Neural Network Pruning; Pruning Criteria; Pruning Sensitivity; Pruning Degree

URI
http://hdl.handle.net/10203/266873
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=733975&flag=dissertation
Appears in Collection
EE-Theses_Master (Master's Theses)
Files in This Item
There are no files associated with this item.
