DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | Myaeng, Sung-Hyon | - |
dc.contributor.advisor | 맹성현 | - |
dc.contributor.author | Jang, Kyoung-Rok | - |
dc.date.accessioned | 2023-06-23T19:34:29Z | - |
dc.date.available | 2023-06-23T19:34:29Z | - |
dc.date.issued | 2022 | - |
dc.identifier.uri | http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=996352&flag=dissertation | en_US |
dc.identifier.uri | http://hdl.handle.net/10203/309237 | - |
dc.description | Doctoral dissertation (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST) : School of Computing, 2022.2, [v, 62 p.] | - |
dc.description.abstract | Deep learning-based models generally use low-dimensional dense representations to express data samples. Although compact and powerful, such representations have several shortcomings that make them unsuitable for tasks that require processing a large number of samples (e.g., searching for documents in a web-scale corpus). More specifically, because the limited number of available dimensions leaves each dimension highly entangled, low-dimensional dense representations are susceptible to false matches when the number of samples is large. Also, all dimensions must participate in representing and comparing samples regardless of each sample's characteristics, which is inefficient. Lastly, the entangled dimensions of dense representations are usually hard to interpret. This thesis shows how high-dimensional sparse representations can cope with these problems in the field of natural language processing (NLP). We first explain the theoretical background and properties of high-dimensional sparse representations. Then we show how high dimensionality and sparseness let us achieve both performance and efficiency when applied to information retrieval (IR) and question answering (QA), NLP tasks that require accurately finding relevant documents or answers in a vast corpus with low latency. Finally, we introduce a method to interpret the model's outcomes in quantitative and qualitative ways. | - |
dc.language | eng | - |
dc.publisher | Korea Advanced Institute of Science and Technology (KAIST) | - |
dc.title | Theory and application of ultra high dimensional sparse representations for efficient and interpretable semantic search | - |
dc.title.alternative | 효율적이고 해석 가능한 의미 검색을 위한 초고차원 희소 표상의 이론과 응용 | - |
dc.type | Thesis(Ph.D) | - |
dc.identifier.CNRN | 325007 | - |
dc.description.department | Korea Advanced Institute of Science and Technology (KAIST) : School of Computing | - |
dc.contributor.alternativeauthor | 장경록 | - |