dc.contributor.advisor: Brimacombe, Michael
dc.contributor.author: Bimali, Milan
dc.date.accessioned: 2016-11-11T00:00:02Z
dc.date.available: 2016-11-11T00:00:02Z
dc.date.issued: 2015-12-31
dc.date.submitted: 2015
dc.identifier.other: http://dissertations.umi.com/ku:14370
dc.identifier.uri: http://hdl.handle.net/1808/21918
dc.description.abstract: The likelihood is a function of the model parameter(s) and the data, defined through a pre-specified probability density function (pdf). It can thus be viewed as a model-data combination that can be used to address questions of interest. The relative likelihood function (RLF) is the likelihood function divided by its value at the mode, so that its maximum is one. Unlike likelihood functions, relative likelihood functions have attracted little attention and use among statisticians. This dissertation explores the properties and applications of RLFs in examining the large-sample convergence properties of the maximum likelihood estimator (MLE) and in relation to clustering. The dissertation consists of three chapters. The first chapter presents a simulation-based approach to examining the relationship between sample size and the asymptotic behavior of the MLE. The convergence of the observed RLF to the asymptotic RLF is assessed for different sample sizes using two measures of convergence: difference in areas and dissimilarity in shape. The approach is applied to data from the literature as well as to data simulated from different exponential family distributions. The second chapter proposes a novel clustering approach based on observed RLFs. Observations in the dataset are assumed to follow a known distribution, and the observed RLFs are obtained. The observed RLFs are then scaled by the inverse of the asymptotic variation (Fisher Information) evaluated at the mode of the likelihood functions. The resulting weighted RLFs reflect information-based similarity among observations in the data. A data matrix is then constructed by evaluating the weighted RLFs at different values in the parameter space. This data matrix allows direct application of standard clustering algorithms such as the k-means algorithm. The clustering approach is applied to a simulated dataset based on real data and to datasets simulated from known distributions. The third chapter applies the proposed RLF-based clustering approach to a publicly available gene expression dataset consisting of 70 gene expression profiles used to classify patients into prognostic groups. The agreement between the RLF clustering results and the previous classification is also presented. The clusters obtained are also examined in relation to differences in two clinical features: time to overall survival and time to metastases.
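As an illustration of the first chapter's idea, the sketch below compares an observed RLF with a normal-approximation RLF and reports the area between the two curves at several sample sizes. It is a minimal sketch and not the dissertation's method: the exponential model, the use of the normal approximation as the "asymptotic RLF", the absolute-difference area measure, and all function names (`observed_rlf`, `asymptotic_rlf`, `area_difference`) are assumptions made here for illustration.

```python
import math


def observed_rlf(lam, n, lam_hat):
    """Relative likelihood of rate `lam` for n i.i.d. exponential observations.

    With MLE lam_hat = n / sum(x), the ratio L(lam) / L(lam_hat) simplifies to
    (lam / lam_hat)^n * exp(n * (1 - lam / lam_hat)), so its maximum is 1.
    """
    if lam <= 0:
        return 0.0
    ratio = lam / lam_hat
    return math.exp(n * (math.log(ratio) + 1.0 - ratio))


def asymptotic_rlf(lam, n, lam_hat):
    """Assumed asymptotic RLF: exp(-0.5 * I(lam_hat) * (lam - lam_hat)^2),
    using the observed information I(lam_hat) = n / lam_hat^2."""
    info = n / lam_hat ** 2
    return math.exp(-0.5 * info * (lam - lam_hat) ** 2)


def area_difference(n, lam_hat, grid_points=2001, width=6.0):
    """Trapezoidal estimate of the area between the two RLF curves.

    The grid spans `width` approximate standard errors on each side of the
    MLE, truncated at a small positive value because the rate must be > 0.
    """
    se = lam_hat / math.sqrt(n)
    lo = max(lam_hat - width * se, 1e-9)
    hi = lam_hat + width * se
    step = (hi - lo) / (grid_points - 1)
    total = 0.0
    prev_gap = None
    for i in range(grid_points):
        lam = lo + i * step
        gap = abs(observed_rlf(lam, n, lam_hat) - asymptotic_rlf(lam, n, lam_hat))
        if prev_gap is not None:
            total += 0.5 * (prev_gap + gap) * step
        prev_gap = gap
    return total


if __name__ == "__main__":
    # Hypothetical MLE of 2.0 held fixed while the sample size grows.
    for n in (10, 50, 200):
        print(n, area_difference(n, lam_hat=2.0))
```

Because the exponential RLF depends on the data only through the sample size and the MLE, the sketch takes those two quantities directly instead of simulating data. Note that the unnormalized area shrinks partly because both curves narrow as the sample size grows, so a normalized measure may be preferable in practice.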
dc.format.extent: 128 pages
dc.language.iso: en
dc.publisher: University of Kansas
dc.rights: Copyright held by the author.
dc.subject: Biostatistics
dc.subject: Asymptotic Convergence
dc.subject: Clustering
dc.subject: Fisher Information
dc.subject: Maximum Likelihood Estimator
dc.subject: Relative Likelihood Function
dc.subject: Simulation
dc.title: A Likelihood Based Approach to the Assessment of Large Sample Convergence and Model Based Clustering.
dc.type: Dissertation
dc.contributor.cmtemember: Fridley, Brooke L.
dc.contributor.cmtemember: Diaz, Francisco J.
dc.contributor.cmtemember: Wick, Jo A.
dc.contributor.cmtemember: Apte, Udayan
dc.thesis.degreeDiscipline: Biostatistics
dc.thesis.degreeLevel: Ph.D.
dc.identifier.orcid:
dc.rights.accessrights: openAccess

