dc.contributor.advisor: Huan, Jun
dc.contributor.author: Zhang, Jintao
dc.date.accessioned: 2015-10-12T22:30:47Z
dc.date.available: 2015-10-12T22:30:47Z
dc.date.issued: 2012-12-31
dc.date.submitted: 2012
dc.identifier.other: http://dissertations.umi.com/ku:12388
dc.identifier.uri: http://hdl.handle.net/1808/18634
dc.description.abstract: Adverse drug reactions (ADRs) present a major concern for drug safety and are a major obstacle in modern drug development. They account for about one-third of all late-stage drug failures, and approximately 4% of all new chemical entities are withdrawn from the market due to severe ADRs. Although off-target drug interactions are considered the major cause of ADRs, the adverse reaction profile of a drug depends on a wide range of factors, such as specific features of its chemical structure, its ADME/PK properties, its interactions with proteins, the metabolic machinery of the cellular environment, and the presence of other diseases and drugs. Hence, computational modeling for ADR prediction is highly complex and challenging. We propose a set of statistical learning models that approach ADR prediction systematically from multiple perspectives. We first discuss available data sources for protein-chemical interactions and adverse drug reactions, and how the data can be represented for effective modeling. We also employ biological network analysis to gain a deeper understanding of the chemical and biological mechanisms underlying various ADRs. In addition, because protein-chemical interactions are an important component of ADR prediction, identifying these interactions is a crucial step in both modern drug discovery and ADR prediction. The performance of common supervised learning methods for predicting protein-chemical interactions has been largely limited by the insufficient availability of binding data for many proteins. We propose two multi-task learning (MTL) algorithms for jointly predicting the active compounds of multiple proteins, and our methods significantly outperform existing state-of-the-art methods. These data, methods, and preliminary results help in understanding the underlying mechanisms of ADRs and support further studies.
ADR data are complex and noisy, and in many cases we do not fully understand the molecular mechanisms of ADRs. Because the data available for some ADRs are noisy and heterogeneous, we propose a sparse multi-view learning (MVL) algorithm for predicting a specific ADR: drug-induced QT prolongation, a major life-threatening adverse drug effect. It is crucial to predict the QT prolongation effect as early as possible in drug development. MVL algorithms work very well when complex data from diverse domains are involved and only limited labeled examples are available. Unlike existing MVL methods that use L2-norm co-regularization to obtain a smooth objective function, we propose an L1-norm co-regularized MVL algorithm for predicting QT prolongation, reformulate the objective function, and obtain its gradient in analytic form. We optimize the decision functions on all views simultaneously and achieve a 3- to 4-fold computational speedup compared with previous L2-norm co-regularized MVL methods, which alternately optimize one view with the other views fixed until convergence. L1-norm co-regularization enforces sparsity in the learned mapping functions, so the results are expected to be more interpretable. The proposed MVL method can predict only one ADR at a time. It would be advantageous to predict multiple ADRs jointly, especially when these ADRs are highly related. Advanced modeling techniques should be investigated to make better use of ADR data for more effective ADR prediction. We study the quantitative relationship among drug structures, drug-protein interaction profiles, and drug ADRs. We formalize the modeling problem as a multi-view (drug structure data and drug-protein interaction profile data), multi-task (one drug may cause multiple ADRs, and each ADR is a task) classification problem. We apply co-regularized MVL to each ADR and use regularized MTL to increase the total sample size and improve model performance.
Experimental studies on the ADR data set demonstrate the effectiveness of our multi-view multi-task (MVMT) algorithm. Cluster analysis and significant-feature identification based on the results of our models reveal interesting hidden insights. In summary, we use computational methods such as biological network analysis, multi-task learning, multi-view learning, and inductive multi-view multi-task learning to systematically investigate the modeling of various ADRs and to construct highly accurate models for ADR prediction. We also contribute novel supervised and semi-supervised learning algorithms, which can be applied to many other real-world problems.
dc.format.extent: 165 pages
dc.language.iso: en
dc.publisher: University of Kansas
dc.rights: Copyright held by the author.
dc.subject: Bioinformatics
dc.subject: Information technology
dc.subject: adverse drug reaction
dc.subject: boosting
dc.subject: co-regularization
dc.subject: inductive learning
dc.subject: multi-task learning
dc.subject: multi-view learning
dc.title: Multi-task and Multi-view Learning for Predicting Adverse Drug Reactions
dc.type: Dissertation
dc.contributor.cmtemember: Vakser, Ilya
dc.contributor.cmtemember: Im, Wonpil
dc.contributor.cmtemember: Deeds, Eric
dc.contributor.cmtemember: Potetz, Brian
dc.thesis.degreeDiscipline: Information Technology
dc.thesis.degreeLevel: Ph.D.
dc.rights.accessrights: openAccess

