An interpretable machine learning model of cross-sectional U.S. county-level obesity prevalence using explainable artificial intelligence

Allen, Ben

Issue Date

2023-10-05

Author

Allen, Ben

Publisher

PLOS ONE

Type

Article

Article Version

Scholarly/refereed, publisher version

Rights

Copyright © 2023 Ben Allen. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Abstract

Background There is considerable geographic heterogeneity in obesity prevalence across counties in the United States. Machine learning algorithms accurately predict geographic variation in obesity prevalence, but the models are often uninterpretable and viewed as black boxes.

Objective The goal of this study is to extract knowledge from machine learning models for county-level variation in obesity prevalence.

Methods This study applies explainable artificial intelligence methods to machine learning models of cross-sectional obesity prevalence data collected from 3,142 counties in the United States. County-level features were drawn from 7 broad categories: health outcomes, health behaviors, clinical care, social and economic factors, physical environment, demographics, and severe housing conditions. The explainable methods applied to the random forest prediction models include feature importance, accumulated local effects, a global surrogate decision tree, and local interpretable model-agnostic explanations.
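As an illustration of the first method listed, the sketch below computes permutation feature importance, one common model-agnostic form of feature importance; the abstract does not state which variant the study used. Everything in it is an assumption made for illustration: the data are synthetic, the feature names and coefficients are invented, and `black_box_predict` is a hypothetical stand-in for a fitted random forest. It is not the study's code or data.

```python
import random


def r_squared(y_true, y_pred):
    """Coefficient of determination: share of variance in y_true explained."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean drop in R^2 when one feature column is shuffled at a time."""
    rng = random.Random(seed)
    baseline = r_squared(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and y
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            score = r_squared(y, [predict(row) for row in X_perm])
            drops.append(baseline - score)
        importances.append(sum(drops) / n_repeats)
    return baseline, importances


def make_synthetic_counties(n=300, seed=1):
    """Synthetic rows only; the ranges and coefficients are invented."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        inactivity = rng.uniform(15, 40)   # hypothetical % physically inactive
        diabetes = rng.uniform(5, 20)      # hypothetical % with diabetes
        unrelated = rng.uniform(0, 1)      # feature with no effect on y
        obesity = 10 + 0.6 * inactivity + 0.4 * diabetes + rng.gauss(0, 1.5)
        X.append([inactivity, diabetes, unrelated])
        y.append(obesity)
    return X, y


def black_box_predict(row):
    """Hypothetical stand-in for a trained model's predict function."""
    return 10 + 0.6 * row[0] + 0.4 * row[1]


FEATURES = ["physical_inactivity", "diabetes", "unrelated"]
X, y = make_synthetic_counties()
baseline, importances = permutation_importance(black_box_predict, X, y)
ranking = sorted(zip(FEATURES, importances), key=lambda p: p[1], reverse=True)

for name, value in ranking:
    print(f"{name}: mean R^2 drop = {value:.3f}")
```

Because the procedure only calls `predict`, the same function could in principle be pointed at any fitted model. A feature the model ignores produces a drop of zero, which is the behavior the ranking relies on.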

Results The results show that machine learning models explained 79% of the variance in obesity prevalence, with physical inactivity, diabetes, and smoking prevalence being the most important factors in predicting obesity prevalence.
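For readers unfamiliar with the metric, "variance explained" is conventionally measured by the coefficient of determination. The definition below assumes that is the statistic reported here; the abstract does not say how it was computed or on which data split. The symbols are introduced only for this illustration: y_i is the observed obesity prevalence in county i, ŷ_i is the model's prediction, and ȳ is the mean observed prevalence across the n counties.

```latex
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}
```

Under that assumption, a value of 0.79 means the model's squared prediction error is 21% of the squared error obtained by predicting the mean prevalence for every county.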

Conclusions Interpretable machine learning models of health behaviors and outcomes provide substantial insight into obesity prevalence variation across counties in the United States.

Citation

Allen B. An interpretable machine learning model of cross-sectional U.S. county-level obesity prevalence using explainable artificial intelligence. PLoS One. 2023 Oct 5;18(10):e0292341. doi: 10.1371/journal.pone.0292341. PMID: 37796874; PMCID: PMC10553328
