Methodological Developments for the Analysis of Biological Samples in the Presence of Compositional Effects

Issue Date
2020-05-31
Author
Meier, Richard
Publisher
University of Kansas
Format
161 pages
Type
Dissertation
Degree Level
Ph.D.
Discipline
Biostatistics
Rights
Copyright held by the author.
Abstract
Compositional data, in which a vector of observed variables is constrained by a sum total, imposes a unique correlation structure among its components. Because the abundance of components in a biological sample is inherently compositional, that is, constrained by the amount of collected biomass, it is not surprising that compositional data frequently arises in biomedical research. Failing to account for compositional effects compromises statistical inference and may lead to spurious results that are not reproducible. The development and application of statistical techniques that honor compositionality in the context of biomedical research is therefore of great importance. In this dissertation, we first investigate microbial composition in the pancreatic microbiome (not well characterized prior to this research) and surrounding tissue using a variety of statistical methods. We identify similarities between tissue types, as well as differences between tissues from subjects with different types of pancreatic cancer and tissues from non-cancer subjects. The identification of microbes commonly found in oral cavities then motivates the question of whether consistent patterns of the microbial landscape with respect to disease can be found between the mouth and the gut. Since there is no established method to test for these patterns in microbiome data, we continue by presenting a suitable Bayesian testing framework that addresses the unique challenges posed by microbial abundance data. We elaborate on how the method applies simultaneously to a variety of data models and types of estimates of microbial abundance, and demonstrate its ability to detect desired associations via simulation studies. Further, analysis of microbiome profiles derived from gut and oral cavity samples collected from pancreatic cancer cases is used to successfully identify microbes that exhibit consistent patterns of interest.
This dissertation closes with methodological developments for the analysis of DNA methylation levels of bulk samples with heterogeneous cell composition. Building on novel modelling approaches that are able to detect cell-type-specific methylation based on bulk samples, we introduce a Bayesian hierarchical modelling strategy that leverages spatial correlation of proximal CpG dinucleotides. We describe how our method was empirically motivated by whole blood methylation data of isolated cell types and demonstrate its improved prediction accuracy and statistical power compared to non-spatial models.
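The abstract's opening claim, that a sum constraint by itself imposes a correlation structure, can be illustrated with a small simulation. The sketch below is not taken from the dissertation. It assumes three hypothetical taxa whose absolute abundances are independent gamma draws, and the sample size, seed, and helper names (`pearson`, `close`) are arbitrary choices made for illustration. Rescaling each sample to proportions makes the components negatively correlated even though the underlying abundances are independent.

```python
import random


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def close(row):
    """Rescale a vector of positive abundances so that it sums to 1."""
    total = sum(row)
    return [v / total for v in row]


rng = random.Random(0)  # fixed seed, chosen arbitrarily for reproducibility

# Assumption: "absolute" abundances of three taxa, drawn independently.
raw = [[rng.gammavariate(2.0, 1.0) for _ in range(3)] for _ in range(5000)]

# Observed data: each sample reduced to relative abundances (a composition).
comp = [close(row) for row in raw]

raw_corr = pearson([r[0] for r in raw], [r[1] for r in raw])
comp_corr = pearson([c[0] for c in comp], [c[1] for c in comp])

print(f"correlation of taxa 1 and 2, absolute abundances: {raw_corr:+.3f}")
print(f"correlation of taxa 1 and 2, relative abundances: {comp_corr:+.3f}")
```

Under these assumptions the first correlation should be close to zero and the second clearly negative (the proportions follow a symmetric Dirichlet distribution, for which the theoretical pairwise correlation is -0.5). A naive correlation analysis of the proportions would therefore report an association that does not exist in the underlying abundances.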
Items in KU ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.