| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Kim, Sanghyuk | ko |
| dc.contributor.author | Kim, Hansu | ko |
| dc.contributor.author | Kang, Namwoo | ko |
| dc.contributor.author | Lee, Tae Hee | ko |
| dc.date.accessioned | 2025-10-17T07:00:10Z | - |
| dc.date.available | 2025-10-17T07:00:10Z | - |
| dc.date.created | 2025-10-17 | - |
| dc.date.issued | 2025-12 | - |
| dc.identifier.citation | NEUROCOMPUTING, v.657 | - |
| dc.identifier.issn | 0925-2312 | - |
| dc.identifier.uri | http://hdl.handle.net/10203/334601 | - |
| dc.description.abstract | Deep learning optimization faces a fundamental trade-off between convergence efficiency and generalization. First-order methods such as stochastic gradient descent (SGD) and adaptive moment estimation (Adam) tend to find flatter minima but converge slowly, while higher-order methods converge rapidly but are often drawn to sharp minima that generalize poorly. To address this, we introduce the projected variable three-term conjugate gradient (PVTTCG) algorithm. Motivated by the geometric instabilities in modern networks that use techniques such as batch normalization (BN), PVTTCG integrates an orthogonal projection into the higher-order optimization framework. This mechanism eliminates radial components from the search direction, inherently guiding the optimization toward flatter regions without requiring additional regularization terms or hyperparameters. The effectiveness of PVTTCG is validated across diverse tasks, including language modeling, large-scale image classification, and a real-world engineering application. In complex scenarios, PVTTCG consistently improves upon its higher-order baseline, achieving up to a 3.92 percentage point gain on CIFAR-100 while remaining competitive with leading first-order methods. A systematic analysis reveals that PVTTCG demonstrates superior robustness to batch size variations, particularly excelling at larger batch sizes. This robustness enables the algorithm to process batch sizes up to 2,048 in engineering applications, achieving a 35.9% test loss reduction compared to Adam. These findings establish PVTTCG as an effective solution for bridging the convergence-generalization trade-off. | - |
| dc.language | English | - |
| dc.publisher | ELSEVIER | - |
| dc.title | Projected variable three-term conjugate gradient algorithm for enhancing generalization performance in deep neural network training | - |
| dc.type | Article | - |
| dc.identifier.wosid | 001578610400001 | - |
| dc.identifier.scopusid | 2-s2.0-105016527430 | - |
| dc.type.rims | ART | - |
| dc.citation.volume | 657 | - |
| dc.citation.publicationname | NEUROCOMPUTING | - |
| dc.identifier.doi | 10.1016/j.neucom.2025.131568 | - |
| dc.contributor.localauthor | Kang, Namwoo | - |
| dc.contributor.nonIdAuthor | Kim, Hansu | - |
| dc.contributor.nonIdAuthor | Lee, Tae Hee | - |
| dc.description.isOpenAccess | N | - |
| dc.type.journalArticle | Article | - |
| dc.subject.keywordAuthor | Optimization algorithm | - |
| dc.subject.keywordAuthor | Generalization performance | - |
| dc.subject.keywordAuthor | Conjugate gradient method | - |
| dc.subject.keywordAuthor | Vehicle crashworthiness | - |
| dc.subject.keywordAuthor | Image classification | - |
| dc.subject.keywordAuthor | Language modeling | - |
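The abstract describes an orthogonal projection that removes radial (weight-aligned) components from the search direction. The following is a minimal, hypothetical sketch of that projection idea in NumPy; the function name and interface are assumptions for illustration, not the authors' implementation:

```python
import numpy as np

def project_search_direction(d, w):
    """Hypothetical sketch: remove the radial component of a search
    direction d with respect to the weight vector w, returning only
    the part of d orthogonal to w."""
    d = np.asarray(d, dtype=float)
    w = np.asarray(w, dtype=float)
    radial = (d @ w) / (w @ w) * w  # component of d along w
    return d - radial

# The projected direction is orthogonal to w by construction:
w = np.array([3.0, 4.0])
d = np.array([1.0, 2.0])
p = project_search_direction(d, w)
```

Under this sketch, the returned direction satisfies `p @ w == 0` (up to floating-point error), which is the orthogonality property the abstract attributes to the projection mechanism.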
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.