Ultra-High Dimensional Sparse Representations with Binarization for Efficient Text Retrieval

Jang, Kyoung-Rok / Kang, Junmo / Hong, Giwon / Myaeng, Sung-Hyon / Park, Joohee / Yoon, Taewon / Seo, Heecheol

The semantic matching capabilities of neural information retrieval can ameliorate synonymy and polysemy problems of symbolic approaches. However, neural models’ dense representations are more suitable for re-ranking, due to their inefficiency. Sparse representations, either in symbolic or latent form, are more efficient with an inverted index. Taking the merits of the sparse and dense representations, we propose an ultra-high dimensional (UHD) representation scheme equipped with directly controllable sparsity. UHD’s large capacity and minimal noise and interference among the dimensions allow for binarized representations, which are highly efficient for storage and search. Also proposed is a bucketing method, where the embeddings from multiple layers of BERT are selected/merged to represent diverse linguistic aspects. We test our models with MS MARCO and TREC CAR, showing that our models outperform other sparse models.
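
To make the abstract's point about binarized sparse codes and inverted indexes concrete, here is a minimal toy sketch. It is not the authors' implementation: the top-k selection used to control sparsity, the overlap-count scoring, and all function names (`binarize_top_k`, `build_inverted_index`, `search`) are assumptions made purely for illustration, and the vectors are tiny rather than ultra-high dimensional.

```python
from collections import defaultdict


def binarize_top_k(vector, k):
    """Keep the indices of the k largest activations as a binary sparse code.

    Assumption for illustration: sparsity is controlled directly by k.
    """
    ranked = sorted(range(len(vector)), key=lambda i: vector[i], reverse=True)
    return frozenset(ranked[:k])


def build_inverted_index(doc_codes):
    """Map each active dimension to the list of documents that activate it."""
    index = defaultdict(list)
    for doc_id, code in doc_codes.items():
        for dim in code:
            index[dim].append(doc_id)
    return index


def search(index, query_code, top_n=10):
    """Score documents by the number of active dimensions shared with the query."""
    scores = defaultdict(int)
    for dim in query_code:
        for doc_id in index.get(dim, ()):
            scores[doc_id] += 1
    # Highest overlap first; ties broken by document id for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))[:top_n]


# Hypothetical 6-dimensional activations standing in for encoder outputs.
doc_vectors = {
    "d1": [0.9, 0.1, 0.8, 0.0, 0.2, 0.7],
    "d2": [0.0, 0.9, 0.1, 0.8, 0.7, 0.2],
}
doc_codes = {doc_id: binarize_top_k(vec, k=3) for doc_id, vec in doc_vectors.items()}
index = build_inverted_index(doc_codes)

query_code = binarize_top_k([0.8, 0.0, 0.9, 0.6, 0.1, 0.2], k=3)
results = search(index, query_code)
```

Because each code is just a set of active dimension indices, only the posting lists for the query's active dimensions are visited at search time, which is the efficiency argument the abstract makes for sparse, binarized representations.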

Publisher: Association for Computational Linguistics

Issue Date: 2021-11-07

Language: English

Citation: Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1016-1029

URI: http://hdl.handle.net/10203/289392

Appears in Collection: CS-Conference Papers
