Abstract
Across several medical fields, developing an approach for disease classification is an important challenge. The usual procedure is to fit a model for the longitudinal response in the healthy population, a different model for the longitudinal response in the diseased population, and then apply Bayes' theorem to obtain disease probabilities given the responses. Unfortunately, when substantial heterogeneity exists within each population, this type of Bayes classification may perform poorly. In this article, we develop a new approach by fitting a Bayesian nonparametric model for the joint outcome of disease status and longitudinal response, and then we perform classification through the clustering induced by the Dirichlet process. This approach is highly flexible and allows for multiple subpopulations of healthy, diseased, and possibly mixed membership. In addition, we introduce a Markov chain Monte Carlo sampling scheme that facilitates the assessment of the inference and prediction capabilities of our model. Finally, we demonstrate the method by predicting pregnancy outcomes using longitudinal profiles of human chorionic gonadotropin beta subunit hormone levels in a sample of Chilean women being treated with assisted reproductive therapy.
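As an illustration of the usual two-model procedure described in the abstract (not of the nonparametric method the article proposes), the following minimal sketch applies Bayes' theorem to a single longitudinal profile. It assumes, purely for illustration, independent normal errors around one fixed mean curve per population; the function names, the `prior` argument, and the mean curves are hypothetical and do not come from the article.

```python
import math


def gaussian_loglik(y, mean, sd):
    """Log-likelihood of a profile y under independent normal errors
    around a given mean curve (an illustrative assumption)."""
    return sum(
        -0.5 * math.log(2 * math.pi * sd ** 2) - (yi - mi) ** 2 / (2 * sd ** 2)
        for yi, mi in zip(y, mean)
    )


def disease_probability(y, prior, mean_healthy, mean_diseased, sd):
    """Bayes' theorem: P(diseased | y) from two population-level models.

    prior is the assumed marginal probability of disease.
    """
    log_d = math.log(prior) + gaussian_loglik(y, mean_diseased, sd)
    log_h = math.log(1 - prior) + gaussian_loglik(y, mean_healthy, sd)
    # Normalise on the log scale to avoid underflow for long profiles.
    m = max(log_d, log_h)
    w_d = math.exp(log_d - m)
    w_h = math.exp(log_h - m)
    return w_d / (w_d + w_h)


# Hypothetical profile observed at two time points.
p = disease_probability(
    y=[2.0, 3.0],
    prior=0.5,
    mean_healthy=[0.0, 1.0],
    mean_diseased=[2.0, 3.0],
    sd=1.0,
)
print(round(p, 3))
```

Because each population is summarised here by a single mean curve, a sketch of this kind has no way to represent several distinct subpopulations within the healthy or diseased group, which is the setting in which the abstract notes that Bayes classification may perform poorly.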
Original language | English |
---|---|
Pages (from-to) | 209-225 |
Number of pages | 17 |
Journal | Biostatistics |
Volume | 24 |
Issue number | 1 |
State | Published - 1 Jan 2023 |
Externally published | Yes |
Keywords
- Bayesian nonparametric
- Dirichlet process
- Longitudinal data
- Model-based classification