Show simple item record

dc.contributor.advisor: Little, Todd D.
dc.contributor.author: Howard, Waylon Justin
dc.date.accessioned: 2013-02-17T16:35:55Z
dc.date.available: 2013-02-17T16:35:55Z
dc.date.issued: 2012-08-31
dc.date.submitted: 2012
dc.identifier.other: http://dissertations.umi.com/ku:12364
dc.identifier.uri: http://hdl.handle.net/1808/10815
dc.description.abstract: The purpose of this dissertation is to address an important issue in the imputation of missing data in large data sets. The issue can arise in any analysis in which auxiliary variables are used to inform a modern missing data handling procedure (e.g., FIML, MI) to support the missing at random assumption, reduce bias, and decrease standard errors. The problem is that researchers suggest an "inclusive strategy" in which as many auxiliary variables as possible are included. However, the model becomes more complex with each additional auxiliary variable, so there is a practical limit to the number of auxiliary variables that can be successfully included. Beyond this limit, the model will fail to converge. Large data projects can present a challenge because it is possible to have hundreds of potential auxiliary variables to inform the missing data handling procedure, especially when non-linear information is included. The dissertation is divided into the following sections: 1) a brief discussion of the issue of missing data; 2) a review of the history of missing data, including theory and existing solutions for handling missingness; 3) an assessment of the use of auxiliary variables in missing data handling; 4) a discussion of convergence failure with modern missing data methods; 5) a basic introduction to principal component analysis; 6) the introduction of an alternative strategy to address the problem of a large number of auxiliary variables; and finally, 7) a demonstration of the potential of using principal component scores as auxiliary variables by applying the approach to the analysis of simulated and empirical data.
dc.format.extent: 286 pages
dc.language.iso: en
dc.publisher: University of Kansas
dc.rights: This item is protected by copyright and unless otherwise specified the copyright of this thesis/dissertation is held by the author.
dc.subject: Psychology
dc.subject: Statistics
dc.subject: Auxiliary variables
dc.subject: History
dc.subject: Large datasets
dc.subject: Missing data
dc.subject: Principal component analysis (PCA)
dc.title: USING PRINCIPAL COMPONENT ANALYSIS (PCA) TO OBTAIN AUXILIARY VARIABLES FOR MISSING DATA IN LARGE DATA SETS
dc.type: Dissertation
dc.contributor.cmtemember: Johnson, Paul
dc.contributor.cmtemember: Walker, Dale
dc.contributor.cmtemember: Woods, Carol
dc.contributor.cmtemember: Wu, Wei
dc.thesis.degreeDiscipline: Psychology
dc.thesis.degreeLevel: Ph.D.
kusw.oastatus: na
kusw.oapolicy: This item does not meet KU Open Access policy criteria.
dc.rights.accessrights: openAccess

