Machine Learning-Based Pre-Test Risk Stratification for PCR-Confirmed Chlamydia Using Patient-Reported Data and Urine Biomarkers

Source: cs.LG updates on arXiv.org · Field: Science & Research (Health & Medical Research, Artificial Intelligence & Machine Learning, Data Science & Analytics) · Depth: Expert, extended

Summary

A study evaluated machine learning models for pre-test risk stratification (PTRS) of PCR-confirmed Chlamydia trachomatis infection using non-invasive clinical data. Researchers analyzed a dataset of 93 urine samples with PCR reference labels, employing five supervised classifiers (Logistic Regression, Decision Tree, Random Forest, XGBoost, k-Nearest Neighbors) and three feature groups: patient-reported history/symptoms (F1), urine biomarkers (F2), and their combination (F3). Models using F1 data achieved moderate discrimination (AUC up to 0.72) but showed high variability. Urine biomarker models (F2) demonstrated more consistent performance, with ensemble methods yielding the strongest results. Combining feature groups (F3) marginally increased peak AUC and reduced performance variability, enhancing robustness. The findings suggest urine biomarkers provide a reliable predictive signal complementary to patient-reported information, supporting their integration into screening workflows, especially in decentralized or resource-constrained settings.

Key takeaway

For public health officials and clinical laboratory managers seeking to optimize Chlamydia screening in resource-constrained settings, consider implementing pre-test risk stratification models that incorporate both patient-reported data and urine biomarkers. This approach can help prioritize individuals for PCR testing, improving efficiency without replacing definitive molecular diagnostics. Focus on ensemble-based machine learning models for their enhanced robustness and consistent performance.
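As a rough illustration of what "prioritizing individuals for PCR testing" could look like once a model has produced risk scores, the sketch below ranks a queue by score and checks how many PCR-positive cases land in the first batch. It assumes a fixed testing capacity per batch. The function names (`triage`, `capture_rate`), the scores, and the outcomes are all hypothetical and are not taken from the study. The ranking only orders the queue; PCR still decides who is infected.

```python
def triage(risk_scores, capacity):
    """Return indices of the `capacity` highest-risk individuals, highest first."""
    order = sorted(range(len(risk_scores)), key=lambda i: risk_scores[i], reverse=True)
    return order[:capacity]


def capture_rate(selected, labels):
    """Fraction of all PCR-positive cases that fall inside the selected group."""
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("no positive cases to capture")
    return sum(labels[i] for i in selected) / total_pos


# Hypothetical model scores and PCR outcomes (1 = positive), for illustration only.
scores = [0.91, 0.15, 0.62, 0.08, 0.77, 0.33, 0.54, 0.21]
pcr_positive = [1, 0, 1, 0, 0, 0, 1, 0]

first_batch = triage(scores, capacity=4)
print("tested first:", first_batch)
print("share of positives captured:", capture_rate(first_batch, pcr_positive))
```

In a real workflow the capacity, or an equivalent score threshold, would have to be chosen on held-out data, since a cutoff tuned on the same samples used for training tends to look better than it is.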

Key insights

Urine biomarkers offer a reliable, consistent signal for Chlamydia pre-test risk stratification, complementing patient-reported data.

Method

Machine learning models were trained to predict PCR-confirmed Chlamydia status from patient-reported data, urine biomarkers, or their combination. Performance was evaluated with stratified 5-fold cross-validation, and bootstrap confidence intervals were used to quantify uncertainty in the estimates.
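The evaluation scheme can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the study's code. The data are synthetic (93 samples with an assumed 30/63 class split and a single made-up feature), the scorer is a toy midpoint rule standing in for the five classifiers, and the function names (`auc`, `stratified_folds`, `bootstrap_auc_ci`) are hypothetical. The source does not say whether the intervals were computed per fold or on pooled out-of-fold scores; pooling is assumed here.

```python
import random


def auc(labels, scores):
    """ROC AUC via the Mann-Whitney statistic; tied scores count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs both classes")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(labels, k=5, seed=0):
    """Split sample indices into k folds, keeping class proportions roughly equal."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)  # deal each class out round-robin
    return folds


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for AUC, resampling (label, score) pairs."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        sample = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in sample]
        if len(set(ys)) < 2:
            continue  # skip resamples that contain only one class
        stats.append(auc(ys, [scores[i] for i in sample]))
    stats.sort()
    return stats[int(alpha / 2 * n_boot)], stats[int((1 - alpha / 2) * n_boot) - 1]


# Synthetic stand-in data (NOT the study's data): 93 samples, one numeric feature.
data_rng = random.Random(42)
labels = [1] * 30 + [0] * 63  # assumed class balance, for illustration only
feature = [data_rng.gauss(1.0 if y else 0.0, 1.0) for y in labels]

folds = stratified_folds(labels, k=5, seed=0)
oof_scores = [0.0] * len(labels)  # out-of-fold scores, one per sample
for held_out in folds:
    held = set(held_out)
    train = [i for i in range(len(labels)) if i not in held]
    pos_train = [feature[i] for i in train if labels[i] == 1]
    neg_train = [feature[i] for i in train if labels[i] == 0]
    mean_pos = sum(pos_train) / len(pos_train)
    mean_neg = sum(neg_train) / len(neg_train)
    midpoint = (mean_pos + mean_neg) / 2
    direction = 1.0 if mean_pos >= mean_neg else -1.0
    for i in held_out:
        # Toy scorer fitted on the training folds only: signed distance from the midpoint.
        oof_scores[i] = direction * (feature[i] - midpoint)

point = auc(labels, oof_scores)
ci_low, ci_high = bootstrap_auc_ci(labels, oof_scores, n_boot=1000, seed=1)
print(f"AUC = {point:.2f}, 95% bootstrap CI = [{ci_low:.2f}, {ci_high:.2f}]")
```

With a dataset this small, each held-out fold contains only a handful of positive cases, so the width of the interval is often more informative than the point estimate when comparing feature groups.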

Best for: AI Scientist, Research Scientist, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.LG updates on arXiv.org.