Deep reinforcement learning based heuristic DNA sequence alignment algorithm (심층 강화 학습 기반 휴리스틱 염기서열 정렬 알고리즘)

Various methods have been developed to analyze the association between organisms and their genomic sequences. Among them, sequence alignment is the one most frequently used for comparative analysis of genomes. However, the complexity of traditional sequence alignment grows in proportion to the product of the sequence lengths, which makes aligning long sequences such as a human genome considerably challenging. Over the decades, sequence alignment algorithms have improved significantly in complexity, accuracy, and algorithmic diversity. Human-designed algorithms, however, are inherently limited in how completely they can be developed. This thesis introduces a novel method for obtaining optimal sequence alignments based on reinforcement learning. We first propose the local best path selection model, which repeats an optimal alignment within a small window at every step. This model is computationally expensive, however, because it repeats the optimal alignment process many times. We therefore adopt deep reinforcement learning to mimic the proposed heuristic sequence alignment algorithm. The resulting deep reinforcement learning based sequence alignment algorithm, named DQNalign, selects the next destination by observing only the current state, without aligning the subsequences inside the window. DQNalign immediately determines the alignment direction and proceeds along it, using the sequence information within the window at the current alignment position. DQNalign performs particularly well on dissimilar sequence pairs with low identity values. Theoretically, we prove that, for a large window size, the proposed DQNalign can achieve the same performance as the optimal alignment method with only linear complexity. DQNalign minimizes human intervention and finds the optimal path using only the given alignment scoring system.
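The local best path selection model described above can be illustrated with a minimal sketch. It assumes that each step runs a Needleman-Wunsch style global alignment on the two windows starting at the current position and then takes only the first move of that optimal path; the function names, the linear gap penalty, and the scoring values are hypothetical and are not taken from the thesis.

```python
def first_move(wa, wb, match=1, mismatch=-1, gap=-2):
    """Return the first move ('diag', 'down', 'right') of an optimal
    global alignment of the two windows wa and wb (illustrative scoring)."""
    n, m = len(wa), len(wb)
    if n == 0:
        return "right"   # only sequence b has symbols left
    if m == 0:
        return "down"    # only sequence a has symbols left
    # S[i][j] = best score for aligning the suffixes wa[i:] and wb[j:]
    S = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        S[i][m] = S[i + 1][m] + gap
    for j in range(m - 1, -1, -1):
        S[n][j] = S[n][j + 1] + gap
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            sub = match if wa[i] == wb[j] else mismatch
            S[i][j] = max(S[i + 1][j + 1] + sub,
                          S[i + 1][j] + gap,
                          S[i][j + 1] + gap)
    sub = match if wa[0] == wb[0] else mismatch
    options = {"diag": S[1][1] + sub,
               "down": S[1][0] + gap,
               "right": S[0][1] + gap}
    # Ties are broken in favour of 'diag' (dict insertion order).
    return max(options, key=options.get)


def local_best_path_align(a, b, window=8, match=1, mismatch=-1, gap=-2):
    """Greedy alignment: re-solve a small window at every step and
    commit only the first move of the window-optimal path."""
    i = j = 0
    score = 0
    path = []
    while i < len(a) or j < len(b):
        move = first_move(a[i:i + window], b[j:j + window],
                          match, mismatch, gap)
        if move == "diag":
            score += match if a[i] == b[j] else mismatch
            i += 1
            j += 1
        elif move == "down":      # consume a[i], gap in b
            score += gap
            i += 1
        else:                     # consume b[j], gap in a
            score += gap
            j += 1
        path.append(move)
    return score, path


score, path = local_best_path_align("ACGT", "AGT")
```

In this sketch every step costs time quadratic in the window size, which is the repeated-alignment overhead the abstract refers to; a learned policy that maps the window contents directly to a move would avoid that inner dynamic program.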
However, because the previous DQNalign algorithm can perform only global alignment, finding an appropriate starting point was essential to complete the sequence alignment. This thesis proposes a novel local alignment method based on the DQNalign algorithm. In addition, meta-learning gives the proposed local alignment method adaptability to various environments, so that it can be optimized appropriately for different sequence sets. By deriving the window size needed to maintain high performance, we proved that the proposed method has an advantage in terms of local alignment complexity. We also verified the complexity relationships among various parameters through simulations on actual genome sequences. Finally, we confirmed that the advantage of the proposed local alignment method over the conventional method grows as the x-drop parameter increases. This study demonstrates the feasibility of a new alignment algorithm that minimizes human intervention and has higher scalability.
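For readers unfamiliar with the x-drop parameter: in conventional seed-and-extend aligners it is a termination threshold, and extension stops once the running score falls more than X below the best score seen so far. The following is a minimal ungapped sketch of that conventional rule only; the function name and scoring values are hypothetical, and it does not represent the thesis's DQNalign-based method.

```python
def xdrop_extend(a, b, x_drop, match=1, mismatch=-1):
    """Ungapped extension from position 0 of both sequences.
    Stops when the running score drops more than x_drop below the best.
    Returns (best_score, length_of_best_extension)."""
    best = score = 0
    best_len = 0
    for k, (ca, cb) in enumerate(zip(a, b), start=1):
        score += match if ca == cb else mismatch
        if score > best:
            best, best_len = score, k
        elif best - score > x_drop:
            break  # score fell too far below the best: give up extending
    return best, best_len


# Two sequences that agree except for a two-base mismatch in the middle.
short = xdrop_extend("AAAATTAAAAAA", "AAAACCAAAAAA", x_drop=1)
long_ = xdrop_extend("AAAATTAAAAAA", "AAAACCAAAAAA", x_drop=2)
```

A larger x-drop value lets the extension cross the mismatched region and reach the higher-scoring alignment, at the price of exploring more of the sequence before terminating.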
Advisors
Cho, Dongho (조동호)
Description
Korea Advanced Institute of Science and Technology (KAIST): School of Electrical Engineering
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2021
Identifier
325007
Language
eng
Description

Doctoral thesis (Ph.D.) - Korea Advanced Institute of Science and Technology (KAIST): School of Electrical Engineering, 2021.2, [iv, 71 p.]

Keywords

Pairwise sequence alignment; Deep reinforcement learning; Meta learning; Model of evolution; Sequence comparison; 쌍염기서열 정렬; 심층 강화 학습; 메타 러닝; 진화 모델; 염기서열 비교

URI
http://hdl.handle.net/10203/295693
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=956685&flag=dissertation
Appears in Collection
EE-Theses_Ph.D.(박사논문)
Files in This Item
There are no files associated with this item.
