DSpace at KOASAS: AdaBlock: SGD with Practical Block Diagonal Matrix Adaptation for Deep Learning

AdaBlock: SGD with Practical Block Diagonal Matrix Adaptation for Deep Learning

DC Field	Value	Language
dc.contributor.author	Yun, Jihun	ko
dc.contributor.author	Lozano, Aurelie C.	ko
dc.contributor.author	Yang, Eunho	ko
dc.date.accessioned	2022-09-05T01:00:49Z	-
dc.date.available	2022-09-05T01:00:49Z	-
dc.date.created	2022-09-01	-
dc.date.issued	2022-03	-
dc.identifier.citation	International Conference on Artificial Intelligence and Statistics	-
dc.identifier.issn	2640-3498	-
dc.identifier.uri	http://hdl.handle.net/10203/298302	-
dc.description.abstract	We introduce ADABLOCK, a class of adaptive gradient methods that extends popular approaches such as ADAM by adopting the simple and natural idea of using block-diagonal matrix adaptation to effectively utilize structural characteristics of deep learning architectures. Unlike other quadratic or block-diagonal approaches, ADABLOCK has complete freedom to select block-diagonal groups, providing a wider trade-off applicable even to extremely high-dimensional problems. We provide convergence and generalization error bounds for ADABLOCK, and study both theoretically and empirically the impact of the block size on the bounds and advantages over usual diagonal approaches. In addition, we propose a randomized layer-wise variant of ADABLOCK to further reduce computations and memory footprint, and devise an efficient spectrum-clipping scheme for ADABLOCK to benefit from SGD's superior generalization performance. Extensive experiments on several deep learning tasks demonstrate the benefits of block-diagonal adaptation compared to adaptive diagonal methods, vanilla SGD, as well as modified versions of full-matrix adaptation.	-
dc.language	English	-
dc.publisher	JMLR-JOURNAL MACHINE LEARNING RESEARCH	-
dc.title	AdaBlock: SGD with Practical Block Diagonal Matrix Adaptation for Deep Learning	-
dc.type	Conference	-
dc.identifier.wosid	000828072702028	-
dc.type.rims	CONF	-
dc.citation.publicationname	International Conference on Artificial Intelligence and Statistics	-
dc.identifier.conferencecountry	US	-
dc.identifier.conferencelocation	ELECTR NETWORK	-
dc.contributor.localauthor	Yang, Eunho	-
dc.contributor.nonIdAuthor	Lozano, Aurelie C.	-
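
The abstract's central idea, replacing a per-coordinate (diagonal) second-moment estimate with one small matrix per block of coordinates, can be illustrated with a minimal NumPy sketch. This is not the authors' ADABLOCK algorithm: the function names, the RMSProp-style moving average, the hyperparameter values, and the eigendecomposition-based inverse square root are all assumptions made for illustration, and the sketch leaves out momentum, the randomized layer-wise variant, and spectrum clipping.

```python
import numpy as np

def inv_sqrt_psd(mat, eps):
    """Return (mat + eps*I)^(-1/2) for a symmetric PSD matrix (hypothetical helper)."""
    w, q = np.linalg.eigh(mat)
    w = np.maximum(w, 0.0) + eps          # guard against tiny negative eigenvalues
    return (q / np.sqrt(w)) @ q.T         # Q diag(w^-1/2) Q^T

def block_adaptive_step(x, grad, state, blocks, lr=0.1, beta2=0.9, eps=1e-8):
    """One illustrative block-diagonal adaptive step (assumed form, not the paper's).

    x, grad : 1-D parameter and gradient vectors
    state   : list of (b_k, b_k) second-moment matrices, updated in place
    blocks  : list of index arrays partitioning the coordinates
    """
    new_x = x.copy()
    for k, idx in enumerate(blocks):
        g = grad[idx]
        # Moving average of the gradient outer product, restricted to this block.
        state[k] = beta2 * state[k] + (1.0 - beta2) * np.outer(g, g)
        # Precondition the block's gradient with the inverse matrix square root.
        new_x[idx] = x[idx] - lr * inv_sqrt_psd(state[k], eps) @ g
    return new_x

# Usage: four parameters split into two blocks of size two.
x = np.zeros(4)
grad = np.array([1.0, -2.0, 3.0, 0.5])
blocks = [np.array([0, 1]), np.array([2, 3])]
state = [np.zeros((2, 2)) for _ in blocks]
x = block_adaptive_step(x, grad, state, blocks)
```

With blocks of size one, this sketch reduces to a diagonal RMSProp-style update. Larger blocks capture correlations between coordinates, but a block of size b stores b² entries and its eigendecomposition costs O(b³), which is the trade-off the choice of block size controls.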

Appears in Collection: AI-Conference Papers(학술대회논문)

Files in This Item: There are no files associated with this item.

Knowledge Service Development Team, KAIST 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea. T. 82-42-350-4493 Email. koasas@kaist.ac.kr
Copyright © 2016. Korea Advanced Institute of Science and Technology. All Rights Reserved.
