Development of Risk Prediction Models for Severe Periodontitis in a Thai Population: Statistical and Machine-Learning Approaches
1 Department of Clinical Epidemiology and Biostatistics, Faculty of Medicine Ramathibodi Hospital, Mahidol University, Bangkok, Thailand
2 Department of Periodontology, Faculty of Dentistry, Chulalongkorn University, Bangkok, Thailand
3 Center of Excellence in Periodontal Disease and Dental Implant, Chulalongkorn University, Bangkok, Thailand
4 Department of Medicine, Faculty of Medicine Ramathibodi Hospital, Mahidol University, Bangkok, Thailand
5 Centre for Public Health, School of Medicine, Dentistry, and Biomedical Sciences, Queen's University Belfast, Northern Ireland, UK
6 School of Medicine and Public Health, University of Newcastle, Australia
7 Hunter Medical Research Institute, Newcastle, Australia
Abstract
Background: Severe periodontitis affects 26% of Thai adults and 11.2% of adults globally and is characterized by the loss of alveolar bone height. Full-mouth examination by periodontal probing is the gold standard for diagnosis but is time- and resource-intensive. A screening model that identifies people at high risk would allow full-mouth examination to be targeted at those most likely to benefit and would reduce dentists' workload.
Objective: To compare the performance of screening models developed using statistical and machine learning approaches for the risk prediction of severe periodontitis.
Methods: This study used data from the prospective Electricity Generating Authority of Thailand cohort (2008 and 2013 surveys). The outcome was severe periodontitis as defined by the Centers for Disease Control and Prevention–American Academy of Periodontology (CDC–AAP) criteria. Risk prediction models were developed from 21 candidate features using mixed-effects logistic regression (MELR), a recurrent neural network, a mixed-effects support vector machine, and a mixed-effects decision tree.
Results: A total of 3,883 observations from 2,086 participants were split into development (80.1%) and validation (19.9%) sets, with periodontitis prevalences of 34.4% and 34.1%, respectively. The final MELR model contained 6 features (gender, education, smoking, diabetes mellitus, number of teeth, and plaque score) and achieved an area under the receiver operating characteristic curve (AUC) of 0.983 (95% CI 0.977–0.989) and a positive likelihood ratio (LR+) of 11.9 (95% CI 8.8–16.3). The machine learning models performed less well, with AUC values of 0.712, 0.698, and 0.662 for the recurrent neural network, support vector machine, and decision tree models, respectively.
Conclusions: The MELR model may be more useful than the machine learning models for large-scale screening to identify those at high risk of severe periodontitis. External validation using data from other centres is required to evaluate generalizability.
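To help read the positive likelihood ratio reported in the Results, the following is a minimal sketch of how an LR+ is computed from a 2x2 screening table and how it updates a pre-test probability. The function names and the example counts are hypothetical and are not taken from the study; only the LR+ of 11.9 and the validation-set prevalence of 34.1% come from the abstract, and the resulting post-test probability is an illustrative calculation rather than a reported finding.

```python
def positive_likelihood_ratio(tp: int, fp: int, fn: int, tn: int) -> float:
    """LR+ = sensitivity / (1 - specificity) from 2x2 table counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    if specificity == 1.0:
        raise ValueError("LR+ is undefined when specificity equals 1")
    return sensitivity / (1.0 - specificity)


def post_test_probability(pre_test_probability: float, likelihood_ratio: float) -> float:
    """Convert a pre-test probability to a post-test probability via odds."""
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


# Hypothetical counts, for illustration only (not study data).
example_lr = positive_likelihood_ratio(tp=90, fp=20, fn=10, tn=80)

# Values quoted in the abstract: prevalence 34.1%, LR+ 11.9.
illustrative_post_test = post_test_probability(0.341, 11.9)

print(f"Example LR+: {example_lr:.2f}")
print(f"Illustrative post-test probability: {illustrative_post_test:.3f}")
```

Under these assumptions, a positive screen raises the probability of severe periodontitis well above the baseline prevalence, which is the sense in which a high LR+ supports targeted referral for full-mouth examination.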