dc.contributor.author | Feehan, Ryan | |
dc.contributor.author | Franklin, Meghan W. | |
dc.contributor.author | Slusky, Joanna S. G. | |
dc.date.accessioned | 2021-12-10T18:49:34Z | |
dc.date.available | 2021-12-10T18:49:34Z | |
dc.date.issued | 2021-06-17 | |
dc.identifier.citation | Feehan, R., Franklin, M. W., & Slusky, J. (2021). Machine learning differentiates enzymatic and non-enzymatic metals in proteins. Nature Communications, 12(1), 3712. https://doi.org/10.1038/s41467-021-24070-3 | en_US |
dc.identifier.uri | http://hdl.handle.net/1808/32277 | |
dc.description.abstract | Metalloenzymes are 40% of all enzymes and can perform all seven classes of enzyme reactions. Because of the physicochemical similarities between the active sites of metalloenzymes and inactive metal binding sites, it is challenging to differentiate between them. Yet distinguishing these two classes is critical for the identification of both native and designed enzymes. Because of similarities between catalytic and non-catalytic metal binding sites, finding physicochemical features that distinguish these two types of metal sites can indicate aspects that are critical to enzyme function. In this work, we develop the largest structural dataset of enzymatic and non-enzymatic metalloprotein sites to date. We then use a decision-tree ensemble machine learning model to classify metals bound to proteins as enzymatic or non-enzymatic with 92.2% precision and 90.1% recall. Our model scores electrostatic and pocket lining features as more important than pocket volume, despite the fact that volume is the most quantitatively different feature between enzyme and non-enzymatic sites. Finally, we find our model has overall better performance in a side-to-side comparison against other methods that differentiate enzymatic from non-enzymatic sequences. We anticipate that our model’s ability to correctly identify which metal sites are responsible for enzymatic activity could enable identification of new enzymatic mechanisms and de novo enzyme design. | en_US |
dc.publisher | Nature Research | en_US |
dc.rights | © The Author(s) 2021. This article is licensed under a Creative Commons Attribution 4.0 International License. | en_US |
dc.rights.uri | http://creativecommons.org/licenses/by/4.0/ | en_US |
dc.subject | Biocatalysis | en_US |
dc.subject | Machine learning | en_US |
dc.title | Machine learning differentiates enzymatic and non-enzymatic metals in proteins | en_US |
dc.type | Article | en_US |
kusw.kuauthor | Feehan, Ryan | |
kusw.kuauthor | Franklin, Meghan W. | |
kusw.kuauthor | Slusky, Joanna S. G. | |
kusw.kudepartment | Center for Computational Biology | en_US |
kusw.kudepartment | Molecular Biosciences | en_US |
kusw.oanotes | Per Sherpa Romeo 12/10/2021:
Nature Communications
Publication Information
Title: Nature Communications [English]
ISSNs: Electronic: 2041-1723
URL: http://www.nature.com/ncomms/index.html
Publishers: Nature Research [Commercial Publisher]
DOAJ Listing: https://doaj.org/toc/2041-1723
Requires APC: Yes [Data provided by DOAJ]
Publisher Policy
Published Version
OA Publishing: This pathway includes Open Access publishing
Embargo: No Embargo
Licence: CC BY 4.0
Copyright Owner: Authors
Publisher Deposit: PubMed Central, Europe PMC
Location: Any Website, Journal Website
Conditions: Must link to publisher version; publisher copyright and source must be acknowledged and DOI cited | en_US |
dc.identifier.doi | 10.1038/s41467-021-24070-3 | en_US |
dc.identifier.orcid | https://orcid.org/0000-0002-2435-671X | en_US |
dc.identifier.orcid | https://orcid.org/0000-0003-0842-6340 | en_US |
kusw.oaversion | Scholarly/refereed, publisher version | en_US |
kusw.oapolicy | This item meets KU Open Access policy criteria. | en_US |
dc.identifier.pmid | PMC8211803 | en_US |
dc.rights.accessrights | openAccess | en_US |