Novel algorithm for elucidating biologically relevant chemical diversity metrics

Theertham, Bhargav

Issue Date

2007-05-31

Author

Theertham, Bhargav

Publisher

University of Kansas

Type

Thesis

Degree Level

M.S.

Discipline

Electrical Engineering and Computer Science

Rights

This item is protected by copyright and unless otherwise specified the copyright of this thesis/dissertation is held by the author.

Abstract

Despite great advances in the efficiency of analytical and synthetic chemistry, the number of unique compounds that can be practically synthesized and evaluated as prospective pharmaceuticals is still limited. Given a known bioactive species, it is valuable to be able to readily identify a small subset of compounds likely to have similar or better activity. Many popular chemical diversity metrics do not perform very well in this role. Diversity metrics that also encode biological trend information are thus emerging as a desired tool for guiding the assembly of targeted screening libraries. This thesis aims to develop a novel algorithm that seeks to permit simultaneous evaluation of compound collections according to chemical diversity and potential bioactivity. An extensive set of descriptors is thus evaluated herein according to its ability to differentiate chemical and biological similarity trends within compound sets for which screening results exist, and low-dimensional subsets are identified that retain such differentiation capacities. Bioactivity differentiation capacity is quantified as the ability to co-localize known bioactives into bioactive-rich clusters derived from K-means clustering. The descriptors are sorted according to relative variance across a set of training compounds, and filtered by mining increasingly fine meshes for pockets of descriptors whose exclusion from the model induces drastic drops in relative bioactive co-localization. This scheme is found to yield reasonable bioactive enrichment (greater than 50% of all bioactive compounds collected into clusters with enriched positive/negative rates) for screening data sets of some biological targets.
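The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation, and several choices are assumptions made only for illustration: "relative variance" is taken to be variance divided by the squared mean, a cluster counts as bioactive-rich when its active rate exceeds the overall active rate, K-means is seeded deterministically with a farthest-first rule, and the mesh search over pockets of descriptors is replaced by excluding one descriptor at a time. The function names and the toy data set are hypothetical.

```python
from statistics import mean, pvariance

def sq_dist(p, q):
    """Squared Euclidean distance between two descriptor vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans_labels(points, k, max_iter=100):
    """Lloyd's K-means; farthest-first seeding is an assumption for determinism."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(mean(col) for col in zip(*members))
    return labels

def colocalization(labels, active):
    """Fraction of actives that fall in clusters richer in actives than the whole set."""
    base_rate = sum(active) / len(active)
    captured = 0
    for c in set(labels):
        flags = [a for a, lab in zip(active, labels) if lab == c]
        if sum(flags) / len(flags) > base_rate:
            captured += sum(flags)
    return captured / sum(active)

def rank_by_relative_variance(points):
    """Descriptor indices sorted by variance / mean^2 (assumed definition), largest first."""
    def rel_var(col):
        m = mean(col)
        return pvariance(col) / (m * m) if m else float("inf")
    cols = list(zip(*points))
    return sorted(range(len(cols)), key=lambda i: rel_var(cols[i]), reverse=True)

def exclusion_drops(points, active, k):
    """Drop in co-localization when each single descriptor is left out of the model."""
    n_desc = len(points[0])
    full = colocalization(kmeans_labels(points, k), active)
    drops = {}
    for d in range(n_desc):
        keep = [i for i in range(n_desc) if i != d]
        reduced_points = [tuple(p[i] for i in keep) for p in points]
        drops[d] = full - colocalization(kmeans_labels(reduced_points, k), active)
    return full, drops

# Hypothetical toy data: descriptor 0 tracks activity, descriptor 1 does not.
X = [(10.0, 0.1), (10.2, 0.9), (9.8, 0.2), (0.1, 0.8), (0.0, 0.15),
     (0.3, 0.85), (-0.2, 0.1), (0.1, 0.9), (0.2, 0.2)]
active = [1, 1, 1, 1, 0, 0, 0, 0, 0]

full, drops = exclusion_drops(X, active, k=2)
print("descriptor order:", rank_by_relative_variance(X))
print("co-localization with all descriptors:", full)
print("drop when each descriptor is excluded:", drops)
```

On this toy set, three of the four actives land in an active-rich cluster, and leaving out descriptor 0 lowers that fraction while leaving out descriptor 1 does not. In this simplified reading, a large drop marks a descriptor worth keeping in the low-dimensional subset.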

Description

Thesis (M.S.)--University of Kansas, Electrical Engineering and Computer Science, 2007.

URI

http://hdl.handle.net/1808/32119
