A Deep Learning Approach for Detecting Copy Number Variation in Next-Generation Sequencing Data

Issue Date
2019-08-27
Author
Hill, Tom
Unckless, Robert L.
Publisher
Genetics Society of America
Type
Article
Article Version
Scholarly/refereed, author accepted manuscript
Rights
© 2019 Hill, Unckless.
Abstract
Copy number variants (CNVs) are associated with phenotypic variation in several species. However, properly detecting changes in copy numbers of sequences remains a difficult problem, especially in lower-quality or lower-coverage next-generation sequencing data. Here, inspired by recent applications of machine learning in genomics, we describe a method to detect duplications and deletions in short-read sequencing data. In low-coverage data, machine learning appears to be more powerful in the detection of CNVs than the gold-standard methods of coverage estimation alone, and of equal power in high-coverage data. We also demonstrate how replicating training sets allows a more precise detection of CNVs, even identifying novel CNVs in two genomes previously surveyed thoroughly for CNVs using long-read data.
Description
This work is licensed under a Creative Commons Attribution 4.0 International License.
Citation
Hill, T., & Unckless, R. L. (2019). A Deep Learning Approach for Detecting Copy Number Variation in Next-Generation Sequencing Data. G3 (Bethesda, Md.), 9(11), 3575–3582. https://doi.org/10.1534/g3.119.400596