Out-of-core de novo assembly framework for large scale genome

De novo assembly is a vital process in modern genomics that recovers genomes from short DNA fragments. It first constructs an assembly graph that represents the overlaps between fragments and then concatenates the overlapping fragments by traversing that graph. However, the input datasets for de novo assembly often include a large genome or thousands of small genomes, which makes the corresponding assembly graph massive. To address this, this dissertation proposes OC-assembler, a de novo assembly framework that keeps the assembly graph in storage and processes it in an out-of-core fashion. OC-assembler introduces a new graph data structure that reduces the memory footprint by exploiting the double-stranded structure of the input genome, thereby significantly reducing storage I/O. Further, OC-assembler handles the iterative graph updates of de novo assembly efficiently: when the amount of updates is small, it stores a small update log instead of overwriting the entire graph in storage. Our evaluation on diverse genome datasets shows that OC-assembler reduces storage I/O by 2.9×, thereby shortening execution time by 2.1× compared to existing out-of-core graph processing methods.
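The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the OC-assembler implementation, whose data structure and log format the abstract does not describe. The sketch assumes that the double-stranded structure is handled with canonical k-mers, a common technique in which a k-mer and its reverse complement share one graph node. It also assumes a hypothetical update-log policy in which small updates go to a log and the full graph is rewritten only once the log exceeds a fraction of the graph size. All names (canonical, build_nodes, LoggedGraph, log_threshold) are illustrative.

```python
# Illustrative sketch only; names and policies are assumptions,
# not the data structure or log format used by OC-assembler.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the same DNA fragment as read from the opposite strand."""
    return seq.translate(_COMPLEMENT)[::-1]


def canonical(kmer: str) -> str:
    """Pick one representative for a k-mer and its reverse complement."""
    rc = reverse_complement(kmer)
    return kmer if kmer <= rc else rc


def build_nodes(reads, k):
    """Collect canonical k-mers so that both strands share a single node."""
    nodes = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            nodes.add(canonical(read[i:i + k]))
    return nodes


class LoggedGraph:
    """Adjacency map with a small update log (hypothetical policy)."""

    def __init__(self, adjacency, log_threshold=0.1):
        self.base = dict(adjacency)  # stands in for the graph kept in storage
        self.log = {}                # small log of per-node updates
        self.log_threshold = log_threshold
        self.full_rewrites = 0       # how often the whole graph was rewritten

    def update(self, node, neighbors):
        """Record an update in the log; rewrite the graph only if the log is large."""
        self.log[node] = list(neighbors)
        if len(self.log) > self.log_threshold * max(len(self.base), 1):
            self.compact()

    def compact(self):
        """Merge the log into the base graph (the expensive full rewrite)."""
        self.base.update(self.log)
        self.log.clear()
        self.full_rewrites += 1

    def neighbors(self, node):
        """Read a node, preferring the newest entry in the log."""
        return self.log.get(node, self.base.get(node, []))
```

In this sketch the reads AACG and CGTT, which are reverse complements of each other, map to the same two nodes for k = 3, so storing canonical k-mers roughly halves the node count compared with storing each strand separately.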
Advisors
Jung, Myoungsoo (정명수)
Description
Korea Advanced Institute of Science and Technology (KAIST): School of Electrical Engineering
Publisher
Korea Advanced Institute of Science and Technology (KAIST)
Issue Date
2022
Identifier
325007
Language
eng
Description

Thesis (Master's) - Korea Advanced Institute of Science and Technology (KAIST): School of Electrical Engineering, 2022.8, [iii, 23 p.]

Keywords

De novo assembly; DNA sequencing; Bioinformatics; Graph partitioning

URI
http://hdl.handle.net/10203/310033
Link
http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=1008354&flag=dissertation
Appears in Collection
EE-Theses_Master (Master's Theses)
Files in This Item
There are no files associated with this item.
