How Bandwidth Selection Algorithms Impact Exploratory Data Analysis Using Kernel Density Estimation
dc.contributor.advisor | Woods, Carol M | |
dc.contributor.author | Harpole, Jared Kenneth | |
dc.date.accessioned | 2013-08-24T22:34:59Z | |
dc.date.available | 2013-08-24T22:34:59Z | |
dc.date.issued | 2013-05-31 | |
dc.date.submitted | 2013 | |
dc.identifier.other | http://dissertations.umi.com/ku:12636 | |
dc.identifier.uri | http://hdl.handle.net/1808/11725 | |
dc.description.abstract | Exploratory data analysis (EDA) is important yet often overlooked in the social and behavioral sciences. Graphical analysis of one's data is central to EDA. A viable method of estimating and graphing the underlying density in EDA is kernel density estimation (KDE). A central difficulty in using KDE is choosing a bandwidth that yields an accurate representation of the density. The purpose of the present study is to empirically evaluate how the choice of bandwidth in KDE influences recovery of the true density. Simulations compared five bandwidth selection methods [Sheather-Jones plug-in (SJDP), Normal rule of thumb (NROT), Silverman's rule of thumb (SROT), Least squares cross-validation (LSCV), and Biased cross-validation (BCV)] across four true density shapes (Standard Normal, Positively Skewed, Bimodal, and Skewed Bimodal) and eight sample sizes (25, 50, 75, 100, 250, 500, 1000, 2000). Results indicated that SJDP performed best overall, but this advantage held specifically for sample sizes from 250 to 2000; for smaller samples (N = 25 to 100), SROT performed best. Thus, either SJDP or SROT is recommended, depending on the sample size. | |
dc.format.extent | 48 pages | |
dc.language.iso | en | |
dc.publisher | University of Kansas | |
dc.rights | This item is protected by copyright and unless otherwise specified the copyright of this thesis/dissertation is held by the author. | |
dc.subject | Quantitative psychology | |
dc.subject | Psychometrics | |
dc.subject | Statistics | |
dc.subject | Bandwidth selection | |
dc.subject | Exploratory data analysis | |
dc.subject | Graphical analysis | |
dc.subject | Kernel density estimation | |
dc.title | How Bandwidth Selection Algorithms Impact Exploratory Data Analysis Using Kernel Density Estimation | |
dc.type | Thesis | |
dc.contributor.cmtemember | Deboeck, Pascal R. | |
dc.contributor.cmtemember | Johnson, Paul | |
dc.thesis.degreeDiscipline | Psychology | |
dc.thesis.degreeLevel | M.A. | |
kusw.oastatus | na | |
kusw.oapolicy | This item does not meet KU Open Access policy criteria. | |
kusw.bibid | 8086241 | |
dc.rights.accessrights | openAccess |
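As an illustration of the kind of comparison the abstract describes, the following is a minimal sketch, not the thesis's actual simulation code. It assumes that NROT denotes the bandwidth 1.06 · s · n^(-1/5) and SROT denotes 0.9 · min(s, IQR/1.34) · n^(-1/5), the forms commonly associated with those names; the record itself does not state the formulas. It also assumes a Gaussian kernel and uses integrated squared error as the recovery criterion, which the abstract does not specify. All function names are hypothetical.

```python
import math
import random
import statistics


def nrot_bandwidth(data):
    """Normal rule of thumb (assumed form): 1.06 * s * n^(-1/5)."""
    n = len(data)
    return 1.06 * statistics.stdev(data) * n ** (-0.2)


def srot_bandwidth(data):
    """Silverman's rule of thumb (assumed form): 0.9 * min(s, IQR/1.34) * n^(-1/5)."""
    n = len(data)
    sd = statistics.stdev(data)
    q1, _, q3 = statistics.quantiles(data, n=4)  # sample quartiles
    spread = min(sd, (q3 - q1) / 1.34)
    return 0.9 * spread * n ** (-0.2)


def gaussian_kde(data, h):
    """Return a function that evaluates a Gaussian-kernel density estimate."""
    norm = 1.0 / (len(data) * h * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data)

    return density


def standard_normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def integrate(f, lo=-6.0, hi=6.0, steps=600):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    dx = (hi - lo) / steps
    return sum(f(lo + (i + 0.5) * dx) for i in range(steps)) * dx


def integrated_squared_error(estimate, truth):
    """Assumed recovery criterion: integral of (estimate - truth)^2."""
    return integrate(lambda x: (estimate(x) - truth(x)) ** 2)


# One replication: N = 100 draws from a standard normal "true" density.
rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(100)]

h_nrot = nrot_bandwidth(sample)
h_srot = srot_bandwidth(sample)
ise_nrot = integrated_squared_error(gaussian_kde(sample, h_nrot), standard_normal_pdf)
ise_srot = integrated_squared_error(gaussian_kde(sample, h_srot), standard_normal_pdf)

print(f"NROT: h = {h_nrot:.3f}, ISE = {ise_nrot:.5f}")
print(f"SROT: h = {h_srot:.3f}, ISE = {ise_srot:.5f}")
```

A full study along the lines of the abstract would repeat this over many replications, the other true density shapes, the remaining sample sizes, and the SJDP, LSCV, and BCV selectors, none of which are sketched here.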
This item appears in the following Collection(s)
- Psychology Dissertations and Theses [459]
- Theses [3976]