Machine Learning-Based Pre-Test Risk Stratification for PCR-Confirmed Chlamydia Using Patient-Reported Data and Urine Biomarkers
Summary
A study evaluated machine learning models for pre-test risk stratification (PTRS) of PCR-confirmed Chlamydia trachomatis infection using non-invasive clinical data. Researchers analyzed a dataset of 93 urine samples with PCR reference labels, employing five supervised classifiers (Logistic Regression, Decision Tree, Random Forest, XGBoost, k-Nearest Neighbors) and three feature groups: patient-reported history/symptoms (F1), urine biomarkers (F2), and their combination (F3). Models using F1 data achieved moderate discrimination (AUC up to 0.72) but showed high variability. Urine biomarker models (F2) demonstrated more consistent performance, with ensemble methods yielding the strongest results. Combining feature groups (F3) marginally increased peak AUC and reduced performance variability, enhancing robustness. The findings suggest urine biomarkers provide a reliable predictive signal complementary to patient-reported information, supporting their integration into screening workflows, especially in decentralized or resource-constrained settings.
Key takeaway
For public health officials and clinical laboratory managers seeking to optimize Chlamydia screening in resource-constrained settings, consider implementing pre-test risk stratification models that incorporate both patient-reported data and urine biomarkers. This approach can help prioritize individuals for PCR testing, improving efficiency without replacing definitive molecular diagnostics. Focus on ensemble-based machine learning models for their enhanced robustness and consistent performance.
Key insights
Urine biomarkers offer a reliable, consistent signal for Chlamydia pre-test risk stratification, complementing patient-reported data.
Principles
- Feature integration enhances model robustness.
- Ensemble models yield stable, strong discrimination.
- Small datasets require conservative model complexity.
Method
Machine learning models were trained on patient-reported data, urine biomarkers, or their combination to predict PCR-confirmed Chlamydia status. Performance was evaluated with stratified 5-fold cross-validation, with bootstrap confidence intervals used to quantify uncertainty.
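The evaluation scheme named above (stratified 5-fold cross-validation plus bootstrap confidence intervals on AUC) can be sketched in plain Python. This is an illustrative sketch, not the study's code: the data are synthetic, the 30/63 class split is an assumption (only the total of 93 samples comes from the summary), and the "classifier" is a toy class-mean scorer standing in for the five models the study compared. All function names are hypothetical.

```python
import random

def stratified_folds(labels, k=5, seed=0):
    """Assign sample indices to k folds so each fold keeps similar class ratios."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)  # deal each class round-robin across folds
    return folds

def auc(labels, scores):
    """Rank-based AUC: chance that a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs both classes")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def out_of_fold_scores(features, labels, folds):
    """Toy stand-in model: signed distance to the midpoint of the two class means,
    fitted on the training folds only and applied to the held-out fold."""
    scores = [0.0] * len(labels)
    for test_idx in folds:
        held = set(test_idx)
        train = [i for i in range(len(labels)) if i not in held]
        pos = [features[i] for i in train if labels[i] == 1]
        neg = [features[i] for i in train if labels[i] == 0]
        mean_pos, mean_neg = sum(pos) / len(pos), sum(neg) / len(neg)
        direction = 1.0 if mean_pos >= mean_neg else -1.0
        midpoint = (mean_pos + mean_neg) / 2
        for i in test_idx:
            scores[i] = direction * (features[i] - midpoint)
    return scores

def bootstrap_auc_ci(labels, scores, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap CI for AUC, resampling (label, score) pairs."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        sample = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in sample]
        if len(set(ys)) < 2:  # AUC is undefined for a single-class resample
            continue
        stats.append(auc(ys, [scores[i] for i in sample]))
    stats.sort()
    return stats[int((alpha / 2) * n_boot)], stats[int((1 - alpha / 2) * n_boot) - 1]

# Synthetic data: 93 samples (as in the summary); the class split is assumed.
rng = random.Random(42)
labels = [1] * 30 + [0] * 63
features = [rng.gauss(1.0 if y == 1 else 0.0, 1.0) for y in labels]

folds = stratified_folds(labels, k=5, seed=1)
scores = out_of_fold_scores(features, labels, folds)
point = auc(labels, scores)
ci_low, ci_high = bootstrap_auc_ci(labels, scores)
print(f"AUC = {point:.2f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```

With only 93 samples, the interval is typically wide, which is one reason the summary stresses performance variability alongside peak AUC.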
In practice
- Prioritize PCR testing for high-risk individuals.
- Integrate non-invasive data into screening workflows.
- Consider ensemble models for clinical risk prediction.
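As a rough illustration of the first point, prioritization can be as simple as ranking individuals by model-predicted risk and filling the available PCR capacity from the top. The patient IDs, risk values, and the `prioritize_for_pcr` helper below are hypothetical; the summary does not specify how scores would be turned into a testing queue.

```python
def prioritize_for_pcr(risk_by_patient, capacity):
    """Return patient IDs in descending order of predicted risk, limited to capacity.

    risk_by_patient: dict mapping a patient ID to a predicted risk in [0, 1].
    capacity: number of PCR tests available in this batch (assumed constraint).
    """
    ranked = sorted(risk_by_patient.items(), key=lambda kv: kv[1], reverse=True)
    return [patient_id for patient_id, _ in ranked[:capacity]]

# Hypothetical predicted risks from a pre-test stratification model.
risks = {"P01": 0.12, "P02": 0.81, "P03": 0.47, "P04": 0.66, "P05": 0.05}
first_batch = prioritize_for_pcr(risks, capacity=2)
print(first_batch)  # ['P02', 'P04']
```

The ranking only orders the queue; PCR remains the confirmatory test, consistent with the takeaway that stratification does not replace molecular diagnostics.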
Topics
- Machine Learning
- Chlamydia trachomatis
- Pre-Test Risk Stratification
- Urine Biomarkers
- Patient-Reported Data
Best for: AI Scientist, Research Scientist, Machine Learning Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.LG updates on arXiv.org.