DSpace at KOASAS: Protein coding region identification



Protein coding region identification (Korean title: DNA 염기열에서 단백질 코딩 부위 검색 방법, "Method for finding protein coding regions in DNA sequences")


Weon, Se-Yeon / 원세연

A protein coding region is the region of a DNA sequence that gives rise to a protein product. Identifying protein coding regions is usually the first step after a DNA sequence has been determined. Many computer programs have been developed for this purpose, and it is one of the major and productive fields of computational biology today. Protein coding region databases for E. coli, primate, and S. cerevisiae were created from GenBank. From these databases, the frequencies of all 64 trimers were counted in 6 different phases (3 for each direction), and the trimer frequencies of the three organisms were analyzed. A new protein coding measure called TFD (trimer frequency difference) was devised by taking the difference between a trimer's frequency in one phase and its frequency in another phase. Among the 30 possible combinations, 5 (the differences between phase 1, direction 0 and each of the other 5 phases) were selected for use as protein coding measures. The TFDs of the three organisms were analyzed, and the quality of TFD as a protein coding measure was examined. A method for presenting frequency fluctuation, called the NC (normalized cumulative) plot, was devised. Unlike the sliding window method, the NC plot shows frequency fluctuation as it is. Many different applications are possible with the NC plot. By combining TFD and the NC plot, a new computer program for protein coding region identification, called DNAClimber, was developed. For E. coli, 96.4% of 319 test protein coding regions were found using DNAClimber; for S. cerevisiae, 93.5% of 371 test protein coding regions were found. DNAClimber overcomes the so-called antisense symmetry problem of protein coding region identification methods by using $TFD_5$. Another use of DNAClimber is detecting sequencing errors. Since the current method of DNA sequence determination is error prone, it is important to have a tool for detecting sequencing
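The trimer counting and TFD steps described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's DNAClimber code. The function names, the phase numbering (phases 1 to 3 as read offsets 0 to 2), the use of the reverse complement for direction 1, and the sign convention (reference phase minus other phase) are all assumptions made for illustration, since the abstract does not specify them.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"
TRIMERS = ["".join(p) for p in product(BASES, repeat=3)]  # all 64 trimers
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def trimer_frequencies(seq, offset):
    """Relative frequency of each trimer, reading in steps of 3 from `offset`."""
    seq = seq.upper()
    counts = Counter(seq[i:i + 3] for i in range(offset, len(seq) - 2, 3))
    total = sum(counts[t] for t in TRIMERS)  # ignores trimers with other symbols
    return {t: (counts[t] / total if total else 0.0) for t in TRIMERS}


def six_phase_frequencies(coding_seq):
    """Map (phase, direction) -> trimer frequency table.

    Assumed labeling: phase 1..3 is read offset 0..2; direction 0 is the
    given strand and direction 1 is its reverse complement.
    """
    tables = {}
    strands = (coding_seq, reverse_complement(coding_seq))
    for direction, strand in enumerate(strands):
        for offset in range(3):
            tables[(offset + 1, direction)] = trimer_frequencies(strand, offset)
    return tables


def tfd(tables, other, reference=(1, 0)):
    """Assumed TFD: reference-phase frequency minus `other`-phase frequency."""
    return {t: tables[reference][t] - tables[other][t] for t in TRIMERS}


if __name__ == "__main__":
    tables = six_phase_frequencies("ATGGCCATGGCC")  # toy sequence, not real data
    others = [key for key in tables if key != (1, 0)]
    for key in others:
        print(key, tfd(tables, key)["ATG"])
```

Six frequency tables give 6 × 5 = 30 ordered pairs. The sketch computes only the five that involve phase 1, direction 0, matching the selection described in the abstract.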

Advisors: Kang, Chang-Won (강창원)

Description: Korea Advanced Institute of Science and Technology : Department of Biological Sciences

Publisher: Korea Advanced Institute of Science and Technology (한국과학기술원)

Issue Date: 1995

Identifier: 101841/325007 / 000905815

Language: eng

Description: Doctoral thesis (Ph.D.) - Korea Advanced Institute of Science and Technology : Department of Biological Sciences, 1995.8, [ vii, 110 p. ]

Keywords: Protein Coding Region (단백질 코딩 부위); Sequence Analysis (염기열 분석); Codon Usage (코돈 사용빈도)

URI: http://hdl.handle.net/10203/27372

Link: http://library.kaist.ac.kr/search/detail/view.do?bibCtrlNo=101841&flag=dissertation

Appears in Collection: BS-Theses_Ph.D. (박사논문, doctoral theses)

Files in This Item: There are no files associated with this item.


Knowledge Service Development Team, KAIST 291 Daehak-ro, Yuseong-gu, Daejeon 34141, Republic of Korea. T. 82-42-350-4493 Email. koasas@kaist.ac.kr
Copyright © 2016. Korea Advanced Institute of Science and Technology. All Rights Reserved.
