
dc.contributor.advisorMahnken, Jonathan D
dc.contributor.authorLu, Pengcheng
dc.date.accessioned2020-03-23T21:12:45Z
dc.date.available2020-03-23T21:12:45Z
dc.date.issued2019-05-31
dc.date.submitted2019
dc.identifier.otherhttp://dissertations.umi.com/ku:16568
dc.identifier.urihttp://hdl.handle.net/1808/30153
dc.description.abstractNo established method is available to practitioners as an appropriate tool for model validation when sparse data are present in multiple logistic regression models. Sparsity, i.e., very few observations falling in either grouped or individual covariate patterns, invalidates the asymptotic chi-square distribution, which requires large expected frequencies in each group or bin. Among goodness-of-fit tests, the Hosmer-Lemeshow (HL) test is the best known and has been widely used as the standard test for assessing logistic regression models since its introduction. Although the deficiencies of the Hosmer-Lemeshow method have been pointed out for years, no dominant alternative has emerged so far, and research on assessing logistic regression model fit with sparse data is still very active. Two common methods among the few proposed alternatives, namely Copas's unweighted residual sum of squares (RSS) and Su and Wei's and Lin's cumulative sums of residuals (CUMSUM), seemingly perform better than the HL test in some scenarios. However, the studies that introduced those alternatives have obvious limitations: (1) the simulated sample sizes are small (up to 500 observations), (2) the design matrix is relatively simple (usually one continuous and one categorical predictor variable), (3) the number of scenarios considered is limited, and (4) the simulation setups are quite subjective. For these reasons, no well-established guidelines on model validation are available for statistical practitioners' daily use when fitting a multiple logistic regression model with sparse data. A commonly suggested approach is to check model validity by running all of the existing goodness-of-fit tests to see whether they provide similar evidence of lack of fit. Therefore, it is crucial to assess the performance of each method through a comprehensive comparative study.
We designed the comparison to differ in at least four directions, corresponding to the limitations listed above: varied and expanded sample sizes; a relatively complicated design matrix; more scenarios, including adding (over-fitting) continuous or categorical predictor variables and omitting (under-fitting) main-effect and/or interaction terms; and a more flexible or robust simulation setting in which many randomly sampled models, rather than a few pre-specified models, were investigated. Furthermore, we proposed a goodness-of-fit test that introduces a new method of partitioning the fitted values, based on the commonly known conditions for the limiting distribution of chi-square-type statistics for grouped data, which to some extent would overcome the disadvantage of the HL test when the expected counts in some bins are small (the usual cut-off is fewer than five). We also included our proposed method in the comparative study. We summarized the goodness-of-fit results in terms of empirical significance level and power and offered recommendations based on our more general simulation studies.
dc.format.extent148 pages
dc.language.isoen
dc.publisherUniversity of Kansas
dc.rightsCopyright held by the author.
dc.subjectBiostatistics
dc.subjectContinuous Covariates
dc.subjectGoodness-of-fit Test
dc.subjectLogistic Regression
dc.subjectSparse Data
dc.titleTopics in Goodness-of-fit Test for Logistic Regression Models with Continuous Covariates
dc.typeDissertation
dc.contributor.cmtememberGajewski, Byron J
dc.contributor.cmtememberHe, Jianghua
dc.contributor.cmtememberKeighley, John
dc.contributor.cmtememberChoi, Won S
dc.thesis.degreeDisciplineBiostatistics
dc.thesis.degreeLevelPh.D.
dc.rights.accessrightsopenAccess

