Computational method of database construction for genetic variant calling

In this study, we examine the impact of the variant database on base quality score recalibration and develop a database-generation model that gathers potential variant candidates directly from resequencing genome data. Using human genome data, we optimize the model's hyper-parameters and evaluate the performance improvements in both recalibration and variant calling. To test whether our pseudo-database approach is applicable to species other than human, we constructed pseudo-databases for sheep, rice, and chickpea and compared their performance with that of dbSNP. We consistently find that the pseudo-databases improve recalibration and error rates. More importantly, using pseudo-databases led to the identification of additional genetic variants. Reanalysis with our pseudo-database approach therefore effectively recalibrates base quality scores and consequently uncovers hidden genetic variations in published resequencing data.
Institute of Electrical and Electronics Engineers Inc.
2022 IEEE International Conference on Big Data, Big Data 2022, pp.6696 - 6698

Appears in Collection
BiS-Conference Papers(학술회의논문)