
dc.contributor.author: Zhong, Cuncong
dc.contributor.author: Yang, Youngik
dc.contributor.author: Yooseph, Shibu
dc.identifier.citation: Zhong, C., Yang, Y., & Yooseph, S. (2019). GRASP2: fast and memory-efficient gene-centric assembly and homolog search for metagenomic sequencing data. BMC Bioinformatics, 20(Suppl 11), 276.
dc.description: This work is licensed under a Creative Commons Attribution 4.0 International License. [en_US]
dc.description.abstract: Background: A crucial task in metagenomic analysis is to annotate the function and taxonomy of the sequencing reads generated from a microbiome sample. In general, the reads can either be assembled into contigs and searched against reference databases, or searched individually without assembly. The first approach may suffer from fragmentary and incomplete assembly, while the second is hampered by the reduced functional signal contained in short reads. To tackle these issues, we previously developed GRASP (Guided Reference-based Assembly of Short Peptides), which accepts a reference protein sequence as input and aims to assemble its homologs from a database of fragmentary protein sequences. Besides being a gene-centric assembly tool, GRASP also serves as a homolog search tool when the assembled protein sequences are used as templates to recruit reads. GRASP achieves a significantly higher recall rate (60–80% vs. 30–40%) than other homolog search tools such as BLAST. However, GRASP is both time- and space-consuming. Subsequently, we developed GRASPx, which is 30-fold faster than GRASP. Here, we present a completely redesigned algorithm, GRASP2, for this computational problem.

Results: GRASP2 uses the Burrows-Wheeler Transform (BWT) and the FM-index to generate the assembly graph, and reduces the search space by using a fast ungapped alignment strategy as a filter. GRASP2 also explicitly generates candidate paths before alignment, which decouples the iterative access of the assembly graph from that of the alignment matrix. This strategy makes execution more efficient on current computer architectures and contributes to GRASP2’s speedup.

GRASP2 is 8-fold faster than GRASPx (and 250-fold faster than GRASP) and uses 8-fold less memory, while maintaining the original high recall rate of GRASP. GRASP2 reaches a recall rate of ~80%, compared with ~40% for BLAST, both at a high precision level (>95%). While delivering this recall, GRASP2 is only ~3-fold slower than BLASTP.
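
As an illustration of the BWT and FM-index lookup mentioned in the Results, the following is a minimal Python sketch of FM-index backward search, which counts exact occurrences of a short pattern in an indexed sequence. It is not GRASP2's implementation (GRASP2 is written in C++, and the record does not describe its data structures in detail). The function names `build_bwt`, `build_fm_index`, and `count_occurrences` and the example peptide string are hypothetical, and the naive suffix sort is an assumption made for brevity.

```python
from collections import Counter


def build_bwt(text):
    """Return the BWT of text, with a '$' sentinel appended.

    Uses a naive suffix sort, which is fine for a toy example only.
    """
    text += "$"
    suffix_array = sorted(range(len(text)), key=lambda i: text[i:])
    # The BWT character is the one preceding each sorted suffix.
    return "".join(text[i - 1] for i in suffix_array)


def build_fm_index(bwt):
    """Build the two FM-index tables from a BWT string.

    first[c]  = number of characters in the text smaller than c
    occ[c][i] = number of occurrences of c in bwt[:i]
    """
    counts = Counter(bwt)
    first, total = {}, 0
    for ch in sorted(counts):
        first[ch] = total
        total += counts[ch]
    occ = {ch: [0] * (len(bwt) + 1) for ch in counts}
    for i, ch in enumerate(bwt):
        for c in occ:
            occ[c][i + 1] = occ[c][i] + (1 if c == ch else 0)
    return first, occ


def count_occurrences(pattern, first, occ, n):
    """Count exact matches of pattern by backward search over n BWT rows."""
    lo, hi = 0, n  # half-open range of suffix-array rows
    for ch in reversed(pattern):
        if ch not in first:
            return 0
        lo = first[ch] + occ[ch][lo]
        hi = first[ch] + occ[ch][hi]
        if lo >= hi:
            return 0
    return hi - lo


# Hypothetical peptide fragment used only for demonstration.
bwt = build_bwt("MKVLAMKVLG")
first, occ = build_fm_index(bwt)
print(count_occurrences("MKVL", first, occ, len(bwt)))  # prints 2
```

Each step of the search narrows a range of suffix-array rows using only the two tables, so the cost of a lookup grows with the pattern length rather than with the size of the indexed text.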

Conclusion: GRASP2 is a high-performance gene-centric assembly and homolog search tool with significant speedup compared to its predecessors, which makes it a useful tool for metagenomic data analysis. GRASP2 is implemented in C++ and is freely available from
dc.description.sponsorship: University of Kansas New Faculty General Research Fund allocation #2302114 [en_US]
dc.description.sponsorship: National Science Foundation EPSCoR First Awards in Microbiome Research [en_US]
dc.rights: © The Author(s). 2019 [en_US]
dc.title: GRASP2: Fast and memory-efficient gene-centric assembly and homolog search for metagenomic sequencing data [en_US]
kusw.kuauthor: Zhong, Cuncong
kusw.kudepartment: Electrical Engineering and Computer Science [en_US]
kusw.oanotes: Per Sherpa Romeo 11/19/2020:

BMC Bioinformatics
Publication Information
Title: BMC Bioinformatics [English]
ISSNs: Electronic: 1471-2105
URL:
Publishers: BMC [Commercial Publisher]
DOAJ Listing
Requires APC: Yes [Data provided by DOAJ]
Publisher Policy: Open Access pathways permitted by this journal's policy are listed below by article version.

Published Version
OA Publishing: This pathway includes Open Access publishing
Embargo: No Embargo
Licence: CC BY
Copyright Owner: Authors
Publisher Deposit: PubMed Central; Europe PubMed Central
Location: Any Website; Author's Homepage; Institutional Repository; Named Repository (PubMed Central); Journal Website
Conditions: Copy of License must accompany any deposit; Published source must be acknowledged; Must link to publisher version with DOI
kusw.oaversion: Scholarly/refereed, publisher version [en_US]
kusw.oapolicy: This item meets KU Open Access policy criteria. [en_US]

