A comparison of three variant calling pipelines using simulated data
Authors
DOI: https://doi.org/10.15625/2615-9023/16006
Keywords:
Bcftools, GATK, Simulated data, Variant calling, VarScan, Dwgsim
References
DePristo M. A., Banks E., Poplin R., Garimella K. V., Maguire J. R., Hartl C., Philippakis A. A., del Angel G., Rivas M. A., Hanna M., McKenna A., Fennell T. J., Kernytsky A. M., Sivachenko A. Y., Cibulskis K., Gabriel S. B., Altshuler D., Daly M. J., 2011. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat. Genet., 43: 491–498.
Ewing B., Green P., 1998. Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res., 8: 186–194.
Ewing B., Hillier L., Wendl M. C., Green P., 1998. Base-calling of automated sequencer traces using phred. I. Accuracy assessment. Genome Res., 8: 175–185.
Iqbal Z., Caccamo M., Turner I., Flicek P., McVean G., 2012. De novo assembly and genotyping of variants using colored de Bruijn graphs. Nat. Genet., 44: 226–232.
Koboldt D. C., Chen K., Wylie T., Larson D. E., McLellan M. D., Mardis E. R., Weinstock G. M., Wilson R. K., Ding L., 2009. VarScan: variant detection in massively parallel sequencing of individual and pooled samples. Bioinforma. Oxf. Engl., 25: 2283–2285.
Koboldt D. C., Larson D. E., Wilson R. K., 2013. Using VarScan 2 for Germline Variant Calling and Somatic Mutation Detection. Curr. Protoc. Bioinforma., 44: 15.4.1–15.4.17.
Langmead B., Salzberg S. L., 2012. Fast gapped-read alignment with Bowtie 2. Nat. Methods, 9: 357–359.
Li H., 2014. Toward better understanding of artifacts in variant calling from high-coverage samples. Bioinforma. Oxf. Engl., 30: 2843–2851.
Li H., 2013. Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM. arXiv:1303.3997 [q-bio].
Li H., 2012. Exploring single-sample SNP and INDEL calling with whole-genome de novo assembly. Bioinforma. Oxf. Engl., 28: 1838–1844.
Li H., Durbin R., 2009. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics, 25: 1754–1760.
Li H., Handsaker B., Wysoker A., Fennell T., Ruan J., Homer N., Marth G., Abecasis G., Durbin R., 1000 Genome Project Data Processing Subgroup, 2009. The Sequence Alignment/Map format and SAMtools. Bioinforma. Oxf. Engl., 25: 2078–2079.
Li R., Yu C., Li Y., Lam T. W., Yiu S. M., Kristiansen K., Wang J., 2009. SOAP2: an improved ultrafast tool for short read alignment. Bioinforma. Oxf. Engl., 25: 1966–1967.
Meyer L. R., Zweig A. S., Hinrichs A. S., Karolchik D., Kuhn R. M., Wong M., Sloan C. A., Rosenbloom K. R., Roe G., Rhead B., Raney B. J., Pohl A., Malladi V. S., Li C. H., Lee B. T., Learned K., Kirkup V., Hsu F., Heitner S., Harte R. A., Haeussler M., Guruvadoo L., Goldman M., Giardine B. M., Fujita P. A., Dreszer T. R., Diekhans M., Cline M. S., Clawson H., Barber G. P., Haussler D., Kent W. J., 2013. The UCSC Genome Browser database: extensions and updates 2013. Nucleic Acids Res., 41: D64–D69.
Narasimhan V., Danecek P., Scally A., Xue Y., Tyler-Smith C., Durbin R., 2016. BCFtools/RoH: a hidden Markov model approach for detecting autozygosity from next-generation sequencing data. Bioinforma. Oxf. Engl., 32: 1749–1751.
Song K., Li L., Zhang G., 2016. Coverage recommendation for genotyping analysis of highly heterologous species using next-generation sequencing technology. Sci. Rep., 6: 35736.
Sudmant P. H., Rausch T., Gardner E. J., Handsaker R. E., Abyzov A., Huddleston J., Zhang Y., Ye K., Jun G., Hsi-Yang Fritz M., Konkel M. K., Malhotra A., Stütz A. M., Shi X., Paolo Casale F., Chen J., Hormozdiari F., Dayama G., Chen K., Malig M., Chaisson M. J. P., Walter K., Meiers S., Kashin S., Garrison E., Auton A., Lam H. Y. K., Jasmine Mu X., Alkan C., Antaki D., Bae T., Cerveira E., Chines P., Chong Z., Clarke L., Dal E., Ding L., Emery S., Fan X., Gujral M., Kahveci F., Kidd J. M., Kong Y., Lameijer E. W., McCarthy S., Flicek P., Gibbs R. A., Marth G., Mason C. E., Menelaou A., Muzny D. M., Nelson B. J., Noor A., Parrish N. F., Pendleton M., Quitadamo A., Raeder B., Schadt E. E., Romanovitch M., Schlattl A., Sebra R., Shabalin A. A., Untergasser A., Walker J. A., Wang M., Yu F., Zhang C., Zhang J., Zheng-Bradley X., Zhou W., Zichner T., Sebat J., Batzer M. A., McCarroll S. A., The 1000 Genomes Project Consortium, Mills R. E., Gerstein M. B., Bashir A., Stegle O., Devine S. E., Lee C., Eichler E. E., Korbel J. O., 2015. An integrated map of structural variation in 2,504 human genomes. Nature, 526: 75–81.
Tian S., Yan H., Neuhauser C., Slager S. L., 2016. An analytical workflow for accurate variant discovery in highly divergent regions. BMC Genomics, 17(1): 703.
Van der Auwera G. A., Carneiro M. O., Hartl C., Poplin R., Del Angel G., Levy-Moonshine A., Jordan T., Shakir K., Roazen D., Thibault J., Banks E., Garimella K. V., Altshuler D., Gabriel S., DePristo M. A., 2013. From FastQ data to high confidence variant calls: the genome analysis Toolkit best practices pipeline. Curr. Protoc. Bioinforma., 43: 11.10.1–11.10.33.
Weisenfeld N. I., Yin S., Sharpe T., Lau B., Hegarty R., Holmes L., Sogoloff B., Tabbaa D., Williams L., Russ C., Nusbaum C., Lander E. S., MacCallum I., Jaffe D. B., 2014. Comprehensive variation discovery in single human genomes. Nat. Genet., 46: 1350–1355.
Wu L., Yavas G., Hong H., Tong W., Xiao W., 2017. Direct comparison of performance of single nucleotide variant calling in human genome with alignment-based and assembly-based approaches. Sci. Rep., 7: 10963.