Survey on Nucleotide Encoding Techniques and SVM Kernel Design for Human Splice Site Prediction

Splice site prediction in a DNA sequence is a basic search problem: finding the exon/intron and intron/exon boundaries. Removing the introns and joining the exons together forms the mRNA sequence. The resulting mRNA is the input to the translation process, so splicing is a necessary step in the central dogma of molecular biology. The main task of splice site prediction is first to locate the candidate sequences that contain the GT and AG dinucleotides, and then to separate the true splice sites from the false ones among those candidates. In this paper, we survey research on splice site prediction based on the support vector machine (SVM). The works surveyed differ mainly in their nucleotide encoding technique and their choice of SVM kernel. Some methods encode the DNA sequence sparsely, whereas others encode it probabilistically. The encoded sequences serve as the input to the SVM, which classifies them using its learned model. Classification accuracy depends largely on selecting a kernel suited to sequence data and on selecting the kernel parameters. We examine each encoding technique and group the techniques by similarity. We then discuss kernels and their parameter selection. This survey provides a basic understanding of encoding approaches and of proper SVM kernel selection for splice site prediction.
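As an illustration of the pipeline the abstract describes, the following is a minimal sketch of candidate extraction followed by sparse encoding. It is not taken from the paper: the function names, the window sizes, the A/C/G/T bit ordering, and the all-zero vector for unknown bases are assumptions made for this example. The sparse scheme shown is the common one-hot style, in which each nucleotide becomes a 4-bit vector.

```python
from typing import List, Optional

# Sparse (one-hot style) code: one 4-bit vector per nucleotide.
# The A, C, G, T ordering is an assumption made for this sketch.
SPARSE_CODE = {
    "A": (1, 0, 0, 0),
    "C": (0, 1, 0, 0),
    "G": (0, 0, 1, 0),
    "T": (0, 0, 0, 1),
}


def find_candidates(seq: str, dinucleotide: str) -> List[int]:
    """Return every 0-based position where the dinucleotide (GT or AG) occurs."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == dinucleotide]


def extract_window(seq: str, pos: int, upstream: int, downstream: int) -> Optional[str]:
    """Cut a fixed-length window around the dinucleotide starting at pos.

    Returns None when the window would run past either end of the sequence.
    """
    start = pos - upstream
    end = pos + 2 + downstream
    if start < 0 or end > len(seq):
        return None
    return seq[start:end].upper()


def sparse_encode(window: str) -> List[int]:
    """Concatenate the 4-bit codes of all bases into one feature vector.

    Unknown symbols (e.g. N) map to all zeros, an assumption of this sketch.
    """
    vector: List[int] = []
    for base in window:
        vector.extend(SPARSE_CODE.get(base, (0, 0, 0, 0)))
    return vector


if __name__ == "__main__":
    sequence = "ACGTAAGTCC"  # toy sequence, not real genomic data
    for pos in find_candidates(sequence, "GT"):
        window = extract_window(sequence, pos, upstream=2, downstream=2)
        if window is not None:
            print(pos, window, sparse_encode(window))
```

Each resulting vector has four entries per nucleotide in the window, and vectors of this kind are what an SVM would receive as input. A probabilistic encoding would replace the fixed 4-bit codes with values estimated from training data.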
Publisher
Interdisciplinary Bio Central (IBC)
Issue Date
2012-04
Language
English
Citation

Interdisciplinary Bio Central, v.4, no.14, pp.1 - 6

ISSN
2005-8543
DOI
10.4051/ibc.2012.4.4.0014
URI
http://hdl.handle.net/10203/214046
Appears in Collection
CS-Journal Papers