Unsupervised domain adaptation in neural machine translation (Korean title: Unsupervised learning methods in machine translation models)

Domain mismatch between training and test data is a well-known challenge in neural machine translation and leads to performance degradation. When the domain shifts, unknown words appear, certain words take on different meanings, and word co-occurrence statistics change. We focus on these problems on the source-language side and propose two unsupervised domain adaptation methods that use additional source-language monolingual data. First, we propose the Joint Masked Sequence to Sequence (JMSS) model, which shares parameters between the encoder of the conditional masked language model and the encoder of the masked language model. JMSS exploits the masked language model, which ensures that the latent representation of the source sentence becomes robust to the source language. Next, we introduce the Sequence Margin Disparity Discrepancy (SMDD), a conditional masked language model with an auxiliary classifier that learns domain-invariant representations through adversarial training. SMDD attempts to extend an unsupervised domain adaptation algorithm previously limited to classification problems to sequence problems. We also suggest a model selection method for domain adaptation in neural machine translation. We conduct domain adaptation experiments in five domains and demonstrate performance improvements on the domain adaptation tasks. Finally, we show that our approach can surpass Domain Adaptation by Lexicon Induction (DALI) using only source-side monolingual data.
Advisors
Oh, Alice (오혜연)
Description
Korea Advanced Institute of Science and Technology (KAIST): School of Computing
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2021
Identifier
325007
Language
eng
Description

Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST): School of Computing, 2021.2, [iv, 27 p.]

Keywords

Unsupervised domain adaptation; machine translation; common word; non-autoregressive machine translation; sequence generation

URI
http://hdl.handle.net/10203/296134
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=948461&flag=dissertation
Appears in Collection
CS-Theses_Master(석사논문)
Files in This Item
There are no files associated with this item.
