Publication

A Deep Learning Approach for Detecting Copy Number Variation in Next-Generation Sequencing Data

Hill, Tom
Unckless, Robert L.
Abstract
Copy number variants (CNVs) are associated with phenotypic variation in several species. However, properly detecting changes in the copy number of sequences remains a difficult problem, especially in lower-quality or lower-coverage next-generation sequencing data. Here, inspired by recent applications of machine learning in genomics, we describe a method to detect duplications and deletions in short-read sequencing data. In low-coverage data, machine learning appears to be more powerful in detecting CNVs than the gold-standard methods of coverage estimation alone, and of equal power in high-coverage data. We also demonstrate how replicating training sets allows more precise detection of CNVs, even identifying novel CNVs in two genomes previously surveyed thoroughly for CNVs using long-read data.
Description
This work is licensed under a Creative Commons Attribution 4.0 International License.
Date
2019-08-27
Publisher
Genetics Society of America
Keywords
Coverage, Deletion, Duplication, Machine-learning, Next-generation sequencing
Citation
Hill, T., & Unckless, R. L. (2019). A Deep Learning Approach for Detecting Copy Number Variation in Next-Generation Sequencing Data. G3 (Bethesda, Md.), 9(11), 3575–3582. https://doi.org/10.1534/g3.119.400596