Machine learning algorithms to predict the 1-year unfavourable prognosis for advanced schistosomiasis.
The short-term prognosis of advanced schistosomiasis has not been well studied. We aimed to construct prognostic models using machine learning algorithms and to identify the most important predictors from data routinely available under the government medical assistance program. We analysed an established database of advanced schistosomiasis in Hunan, China, and enrolled a total of 9541 patients for the period from January 2008 to December 2018. Candidate predictors were selected from demographics, clinical features, medical examinations and test results. We applied five machine learning algorithms to construct 1-year prognostic models: logistic regression (LR), decision tree (DT), random forest (RF), artificial neural network (ANN) and extreme gradient boosting (XGBoost). The area under the receiver operating characteristic curve (AUC) was used to evaluate model performance. The important predictors of an unfavourable prognosis within 1 year were identified from the optimal model and ranked. A total of 1249 patients (13.1%) had an unfavourable prognosis within 1 year of discharge. The mean age of the participants was 61.94 years, and 70.9% were male. Overall, XGBoost showed the best predictive performance, with the highest AUC (0.846; 95% confidence interval (CI): 0.821, 0.871), compared with LR (0.798; 95% CI: 0.770, 0.827), DT (0.766; 95% CI: 0.733, 0.800), RF (0.823; 95% CI: 0.796, 0.851), and ANN (0.806; 95% CI: 0.778, 0.835). The five most important predictors identified by XGBoost were ascitic fluid volume, haemoglobin (HB), total bilirubin (TB), albumin (ALB), and platelets (PT). We propose XGBoost as the best-performing algorithm for evaluating the 1-year prognosis of advanced schistosomiasis. It may serve as a simple and useful tool for the short-term prediction of an unfavourable prognosis for advanced schistosomiasis in clinical settings.
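As an illustration of the evaluation metric, the sketch below computes an AUC and a 95% CI for one model's predicted risks. It is a minimal example under stated assumptions, not the study's procedure: the abstract does not say how its confidence intervals were obtained, so the percentile bootstrap used here is an assumption, and the function names `auc` and `bootstrap_auc_ci` are hypothetical. Labels are coded 1 for an unfavourable prognosis and 0 otherwise.

```python
import random


def auc(labels, scores):
    """AUC via the rank-sum (Mann-Whitney) identity; tied scores share an average rank."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    n_pos = sum(labels)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative case")
    rank_sum_pos = 0.0
    i = 0
    while i < n:
        # Find the block of tied scores starting at position i.
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        positives_in_block = sum(1 for k in range(i, j) if pairs[k][1] == 1)
        rank_sum_pos += avg_rank * positives_in_block
        i = j
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (assumed method, not taken from the abstract)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[k] for k in idx]
        # Skip resamples that contain only one class, where AUC is undefined.
        if 0 < sum(ys) < n:
            stats.append(auc(ys, [scores[k] for k in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


if __name__ == "__main__":
    # Synthetic data for demonstration only; not drawn from the study database.
    rng = random.Random(42)
    y = [1 if rng.random() < 0.13 else 0 for _ in range(500)]
    s = [rng.gauss(1.0 if label else 0.0, 1.0) for label in y]
    print("AUC:", round(auc(y, s), 3), "95% CI:", bootstrap_auc_ci(y, s))
```

Running the same two functions on each model's held-out predictions would give figures comparable in form to those reported above, although the exact intervals depend on the CI method chosen.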