A NEW METHODOLOGY FOR IDENTIFYING INTERFACE RESIDUES INVOLVED IN BINDING PROTEIN COMPLEXES
dc.contributor.advisor | Chen, Xue-wen | |
dc.contributor.author | Jeong, Jong Cheol | |
dc.date.accessioned | 2012-03-01T20:16:35Z | |
dc.date.available | 2012-03-01T20:16:35Z | |
dc.date.issued | 2011-12-31 | |
dc.date.submitted | 2011 | |
dc.identifier.other | http://dissertations.umi.com/ku:11862 | |
dc.identifier.uri | http://hdl.handle.net/1808/8783 | |
dc.description.abstract | Genome-sequencing projects using advanced technologies have rapidly increased the number of known protein sequences, and demand for identifying protein interaction sites has grown significantly because of its impact on understanding cellular processes, biochemical events, and drug design. However, current wet-laboratory techniques cannot keep pace with the exponentially growing volume of protein sequence data; therefore, sequence-based predictive methods for identifying protein interaction sites have drawn increasing interest. This thesis proposes a new predictive model that can serve as a first step in guiding experimental methods that investigate protein-protein interactions and localize the specific interface residues. The proposed method extracts a wide range of features from protein sequences. The random forests framework is redesigned to use these features effectively and to address the imbalanced-data classification problem commonly encountered in binding site prediction. The method is evaluated with 2,829 interface residues and 24,616 non-interface residues extracted from 99 polypeptide chains in the Protein Data Bank. The experimental results show that the proposed method performs significantly better than two other conventional predictive methods and can reliably predict residues involved in protein interaction sites. As blind tests, the proposed method predicts interaction sites and constructs three protein complexes: the DnaK molecular chaperone system, 1YUW and 1DKG, which provide new insight into the sequence-function relationship. Finally, the robustness of the proposed method is assessed by evaluating the performance obtained from four different ensemble methods. | |
dc.format.extent | 114 pages | |
dc.language.iso | en | |
dc.publisher | University of Kansas | |
dc.rights | This item is protected by copyright and unless otherwise specified the copyright of this thesis/dissertation is held by the author. | |
dc.subject | Bioinformatics | |
dc.subject | Computer science | |
dc.subject | Biomedical engineering | |
dc.subject | Interface residues | |
dc.subject | Machine learning | |
dc.subject | Properties of amino acids | |
dc.subject | Protein binding | |
dc.subject | Protein-protein interactions | |
dc.subject | Protein sequence analysis | |
dc.title | A NEW METHODOLOGY FOR IDENTIFYING INTERFACE RESIDUES INVOLVED IN BINDING PROTEIN COMPLEXES | |
dc.type | Thesis | |
dc.contributor.cmtemember | Huan, Luke | |
dc.contributor.cmtemember | Luo, Bo | |
dc.thesis.degreeDiscipline | Electrical Engineering & Computer Science | |
dc.thesis.degreeLevel | M.S. | |
kusw.oastatus | na | |
dc.identifier.orcid | https://orcid.org/0000-0002-5024-2927 | |
kusw.oapolicy | This item does not meet KU Open Access policy criteria. | |
kusw.bibid | 7643363 | |
dc.rights.accessrights | openAccess |
This item appears in the following Collection(s)
- Engineering Dissertations and Theses [1055]
- Theses [4088]