Capturing ambiguity in natural language understanding tasks with information from internal layers

In natural language understanding (NLU) tasks, there are a large number of ambiguous samples whose correct labels are debatable among annotators. Researchers have recently found that even when additional annotators evaluate such ambiguous samples, their judgments do not converge to a single gold label. It has also been shown that different groups of annotators reproduce a similar degree of ambiguity. It is therefore desirable for a model used in an NLU task not only to predict the label that multiple annotators are most likely to consider correct for a given sample, but also to provide information about its ambiguity, indicating whether other labels could be correct as well. This is particularly crucial when the outcomes of decision-making carry serious consequences, since information about ambiguity can guide users toward more cautious, risk-aware decisions.

In this dissertation, we discuss methods for capturing ambiguous samples in NLU tasks. Because of the inherent ambiguity of NLU tasks, many samples that share similar features can nevertheless carry different labels. It is therefore highly likely that a model has learned, within its internal layers, which features are associated with which labels, and consequently whether a sample is ambiguous. Based on this assumption, our investigation of sample representations at each internal layer reveals that information about the ambiguity of samples is represented more accurately in the lower layers: in lower layers, ambiguous samples lie close to samples with the relevant labels in the embedding space, whereas this tendency is no longer observed in the higher layers.

Based on these observations, we propose methods for capturing ambiguous samples that use distribution or representation information from the lower layers of encoder-based pre-trained language models (PLMs) and decoder-based large language models (LLMs), the two types of models that have recently been predominantly used for NLU tasks. More specifically, we introduce layer pruning, which removes the upper layers close to the output layer so that predictions draw on lower-layer information; knowledge distillation, which distills distribution knowledge from the lower layers; and methods that directly use internal representations from the lower layers. Through experiments on NLU datasets from various domains and tasks, we demonstrate that information from internal layers, particularly the lower layers, is valuable for capturing the ambiguity of samples, and that our proposed methods significantly outperform existing ones.
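To make the lower-layer observation concrete, below is a minimal sketch, not the dissertation's code, of how per-layer sentence representations can be read out of an encoder PLM with the Hugging Face transformers library. The checkpoint name, mean pooling, layer indices, and toy sentences are all illustrative assumptions.

```python
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "bert-base-uncased"  # placeholder checkpoint, not the dissertation's model
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME, output_hidden_states=True)
model.eval()

def layer_embeddings(texts, layer: int) -> torch.Tensor:
    """Mean-pooled representation of each text at a given hidden layer."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        out = model(**batch)
    # hidden_states is a tuple of (num_layers + 1) tensors [batch, seq, dim];
    # index 0 is the embedding layer, 1..N are the transformer layers.
    hidden = out.hidden_states[layer]
    mask = batch["attention_mask"].unsqueeze(-1)
    return (hidden * mask).sum(1) / mask.sum(1)

# Toy usage: compare how close a possibly ambiguous sample sits to
# label-anchored samples at a lower layer versus the top layer.
anchors = ["The movie was great.", "The movie was terrible."]
query = ["The movie was interesting, I guess."]
for layer in (3, 12):  # a lower layer and the top layer of bert-base
    a = layer_embeddings(anchors, layer)
    q = layer_embeddings(query, layer)
    sims = torch.nn.functional.cosine_similarity(q, a)
    print(f"layer {layer}: similarity to each anchor = {sims.tolist()}")
```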
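The layer-pruning approach mentioned in the abstract can likewise be sketched under stated assumptions: the snippet below truncates a BERT-style classifier to its lowest few transformer layers before fine-tuning. The checkpoint, the label count, and the choice of keeping four layers are hypothetical, not taken from the dissertation.

```python
import torch.nn as nn
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=3  # assumed checkpoint and label count
)

def prune_upper_layers(model, keep: int):
    """Keep only the lowest `keep` transformer layers of a BERT-style encoder."""
    encoder = model.bert.encoder
    encoder.layer = nn.ModuleList(encoder.layer[:keep])
    model.config.num_hidden_layers = keep
    return model

model = prune_upper_layers(model, keep=4)
# The truncated model is then fine-tuned as usual; the intuition from the
# abstract is that a lower-layer readout spreads probability mass over the
# plausible labels of ambiguous samples instead of collapsing onto one.
```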
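Finally, a hedged sketch of distilling distribution knowledge from a lower layer: a standard temperature-scaled KL objective in the spirit of Hinton-style distillation, not necessarily the dissertation's exact loss, mixing cross-entropy on gold labels with agreement to logits read from an auxiliary classifier head on a lower layer. All tensor names and hyperparameters here are assumptions.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, lower_layer_logits, labels,
                      temperature: float = 2.0, alpha: float = 0.5):
    """Cross-entropy on gold labels mixed with KL to the lower-layer distribution."""
    ce = F.cross_entropy(student_logits, labels)
    kl = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(lower_layer_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2  # standard temperature scaling of the soft targets
    return alpha * ce + (1.0 - alpha) * kl

# Toy check with random tensors (8 samples, 3 labels).
s = torch.randn(8, 3)          # student logits
t = torch.randn(8, 3)          # logits from an assumed head on a lower layer
y = torch.randint(0, 3, (8,))  # gold labels
print(distillation_loss(s, t, y))
```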
Advisors
박종철 (Jong C. Park)
Publisher
한국과학기술원 (KAIST)
Issue Date
2024
Identifier
325007
Language
eng
Description

Thesis (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST): School of Computing, 2024.2, [v, 52 p.]

Keywords

natural language understanding; ambiguity; internal layer; layer pruning; knowledge distillation; large language model

URI
http://hdl.handle.net/10203/322197
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1100106&flag=dissertation
Appears in Collection
CS-Theses_Ph.D.(박사논문)
Files in This Item
There are no files associated with this item.
