Enhancing deep neural networks with feature attribution: Scaling from edge models to foundation models

Feature attribution is an effective technique for interpreting deep neural networks across a wide range of scales, from compressed convolutional neural networks on edge devices to cloud-based large language models. While feature attribution methods are traditionally used for explanation, this dissertation explores their potential for identifying and addressing model vulnerabilities, ultimately improving model performance and reliability. The study progressively expands in model scale, deriving techniques appropriate to each scale. At the smallest scale, this dissertation identifies that while prediction accuracy is preserved during model compression, the integrity of attribution maps is significantly compromised. To address this, we propose an attribution map matching loss function that simultaneously enhances both model explainability and accuracy. For mid-scale applications, we develop a novel tabular data oversampling framework using transformer-based language models. This approach leverages column-wise self-attention attribution to identify class-representative features, enabling targeted column imputation for generating high-quality synthetic samples. The dissertation then expands to large-scale language models, exploring attribution at both the input and output token level. From the input perspective, we propose an attribution-guided key-value cache compression methodology that identifies and differentially preserves crucial information, optimizing memory usage while maintaining model performance. From the output perspective, we develop a token-level importance-based verifier that enhances the mathematical reasoning capabilities of large language models through fine-grained supervision signals.
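The first contribution above, the attribution map matching loss, can be sketched in miniature. The snippet below is a hedged toy illustration, not the dissertation's actual formulation: it assumes a gradient-times-input attribution on a linear scorer, and the function names and the λ-weighted squared-error term are assumptions introduced here for illustration only.

```python
import numpy as np

def attribution(w, x):
    # Gradient-times-input attribution for a linear scorer f(x) = w·x:
    # the gradient of f with respect to x is w, so attribution_i = w_i * x_i.
    return w * x

def attribution_matching_loss(w_teacher, w_student, x, lam=1.0):
    # Hypothetical sketch of the idea: penalize the squared difference
    # between the attribution maps of the original (teacher) model and the
    # compressed (student) model; in training this term would be added to
    # the usual task loss so compression preserves explanations as well
    # as predictions.
    a_teacher = attribution(w_teacher, x)
    a_student = attribution(w_student, x)
    return lam * float(np.mean((a_teacher - a_student) ** 2))

x = np.array([1.0, 2.0, 3.0])
w_teacher = np.array([0.5, -0.2, 0.1])
w_student = np.array([0.5, -0.2, 0.1])  # identical weights -> identical attributions
print(attribution_matching_loss(w_teacher, w_student, x))  # → 0.0
```

A compressed model whose weights drift from the original would incur a positive penalty, pushing its attribution maps back toward those of the uncompressed model.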
Advisors
Yang, Eunho
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2025
Identifier
325007
Language
eng
Description

Doctoral dissertation (Ph.D.) - Korea Advanced Institute of Science and Technology: Kim Jaechul Graduate School of AI, 2025.2, [xi, 112 p.]

Keywords

Deep Learning; Feature Attribution; Convolutional Neural Network; Large Language Models

URI
http://hdl.handle.net/10203/332271
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1120489&flag=dissertation
Appears in Collection
AI-Theses_Ph.D. (Doctoral Theses)
Files in This Item
There are no files associated with this item.
