John Carlin says, “‘Identifying variables that independently predict…’ is not a well-defined research task”
Summary
John Carlin's recent work in epidemiological research methodology emphasizes classifying research questions into descriptive, predictive, and causal types to achieve clarity of purpose. The author of the original post agrees with the classification but suggests that descriptive and causal questions can be viewed as special cases of predictive questions, in line with a broader data science perspective. Carlin also critiques the practice of dichotomizing research questions, such as "is this an independent prognostic factor?", and advocates instead for assessing predictive value by comparing models with and without specific variables. The author supports Carlin's stance against dichotomization, noting their shared contributions to this theme in "Bayesian Data Analysis." However, the author is unsure what Carlin's specific method for assessing predictive value would look like, and questions Carlin's focus on "adjustment" debates, arguing that sharply defined research questions are beneficial everywhere, not just when deciding on statistical adjustments.
Key takeaway
For Data Scientists and Research Scientists designing epidemiological studies, prioritize framing research questions as descriptive, predictive, or causal to enhance clarity. Avoid dichotomous questions like "is X a prognostic factor?" and instead assess variable importance by comparing model performance with and without the variable. This approach aligns with robust statistical practices and helps avoid misinterpretations of "independent effects."
Key insights
Classifying research questions into descriptive, predictive, and causal types enhances clarity in epidemiological studies.
Principles
- Avoid dichotomous research questions.
- Assess predictive value by model comparison.
Method
Evaluate a variable's predictive value by comparing a model's performance with and without that variable, rather than relying solely on its "independent effect" in a multivariable regression.
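As a rough illustration of the comparison described above, the sketch below fits a linear model with and without a candidate variable and compares out-of-sample error. It is not Carlin's procedure, which the summary does not spell out. The simulated data, the variable names (`age`, `marker`, `outcome`), the choice of mean squared error, and the 5-fold cross-validation are all assumptions made for this example.

```python
import random

def fit_ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    p = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(row[i] * row[j] for row in X) for j in range(p)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]

def cv_mse(X, y, k=5):
    """Mean squared prediction error under k-fold cross-validation."""
    n = len(y)
    squared_errors = []
    for fold in range(k):
        test_idx = set(range(fold, n, k))  # every k-th row is held out
        X_train = [X[i] for i in range(n) if i not in test_idx]
        y_train = [y[i] for i in range(n) if i not in test_idx]
        beta = fit_ols(X_train, y_train)
        for i in test_idx:
            pred = sum(b * x for b, x in zip(beta, X[i]))
            squared_errors.append((y[i] - pred) ** 2)
    return sum(squared_errors) / len(squared_errors)

# Simulated data (hypothetical): the outcome depends on both age and marker.
rng = random.Random(0)
n = 200
age = [rng.gauss(0, 1) for _ in range(n)]
marker = [rng.gauss(0, 1) for _ in range(n)]
outcome = [1.0 + 0.5 * a + 0.8 * m + rng.gauss(0, 1)
           for a, m in zip(age, marker)]

# Design matrices with an intercept column, without and with the marker.
X_without = [[1.0, a] for a in age]
X_with = [[1.0, a, m] for a, m in zip(age, marker)]

mse_without = cv_mse(X_without, outcome)
mse_with = cv_mse(X_with, outcome)
improvement = mse_without - mse_with

print(f"CV MSE without marker: {mse_without:.3f}")
print(f"CV MSE with marker:    {mse_with:.3f}")
print(f"Reduction in CV MSE:   {improvement:.3f}")
```

The quantity reported is the size of the improvement in predictive error, a continuous measure, rather than a yes/no verdict on whether the marker is an "independent predictor". In a real study the performance metric and validation scheme would be chosen to match the research question.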
In practice
- Frame each question as descriptive, predictive, or causal rather than as a yes/no verdict.
- Compare model performance for variable assessment.
Topics
- Research Question Classification
- Predictive Modeling
- Causal Inference
- Bayesian Data Analysis
- Multivariable Regression
Best for: Data Scientist, Research Scientist, AI Scientist
Editorial summary, takeaway, and curation by AIssential. Original article published by Statistical Modeling, Causal Inference, and Social Science.