ReSS: Learning Reasoning Models for Tabular Data Prediction via Symbolic Scaffold

2026-04-16 · Source: cs.AI updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics, Emerging Technologies & Innovation · Depth: Expert, extended

Summary

ReSS is a framework for tabular data prediction that combines a symbolic model with a neural reasoning model. It targets high-stakes domains such as healthcare and finance, where both accuracy and human-understandable explanations are critical. A decision tree is trained on the data, and its instance-level decision paths are extracted as "symbolic scaffolds." Each scaffold guides a Large Language Model (LLM) to generate natural-language reasoning that is grounded in, and strictly adheres to, the underlying decision logic. The resulting reasoning traces form a high-quality dataset for fine-tuning a pretrained LLM, and a scaffold-invariant data augmentation strategy further improves generalization and explainability. To assess faithfulness, ReSS introduces quantitative metrics, including hallucination rate, explanation necessity, and explanation sufficiency. On medical and financial benchmarks, ReSS-trained models improve upon traditional decision trees and standard fine-tuning approaches by up to 10% while producing faithful and consistent reasoning.

Key takeaway

For research scientists developing predictive models for high-stakes tabular data, ReSS offers a way to achieve both high accuracy and verifiable, human-understandable reasoning. Consider using symbolic scaffolds extracted from decision trees to guide LLM fine-tuning: the reported results show better faithfulness and explainability than direct LLM fine-tuning or traditional tree-based models. The framework can improve trust and transparency in critical applications such as healthcare and finance.

Key insights

ReSS combines decision trees and LLMs to generate faithful, explainable reasoning for tabular data predictions.

Principles

Symbolic scaffolds guide LLM reasoning.
Data augmentation improves generalization.
Faithfulness metrics are crucial for evaluation.
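
The summary does not spell out how scaffold-invariant augmentation works. One plausible reading, sketched below as an assumption rather than the authors' method, is to perturb a record's feature values and keep only the variants that still follow the same decision path, so the same scaffold (and the same reasoning) remains valid for each variant. The tree, the feature names, and the `augment` helper are all hypothetical.

```python
import random

# Hypothetical toy tree: (feature, threshold, left, right) or a string leaf label.
# The left branch is taken when value <= threshold, the right branch otherwise.
TREE = ("glucose", 125.0, "low risk", ("bmi", 30.0, "moderate risk", "high risk"))


def decision_path(tree, row):
    """Return the sequence of (feature, direction) branches and the leaf label."""
    path = []
    node = tree
    while not isinstance(node, str):
        feature, threshold, left, right = node
        if row[feature] <= threshold:
            path.append((feature, "<="))
            node = left
        else:
            path.append((feature, ">"))
            node = right
    return path, node


def augment(tree, row, n_variants, scale=0.05, seed=0, max_tries=1000):
    """Jitter every feature by up to +/- scale (relative) and keep only the
    variants whose decision path matches the original row's path.
    This is an assumed interpretation of 'scaffold-invariant'."""
    rng = random.Random(seed)
    target = decision_path(tree, row)
    variants = []
    tries = 0
    while len(variants) < n_variants and tries < max_tries:
        tries += 1
        candidate = {k: v * (1 + rng.uniform(-scale, scale)) for k, v in row.items()}
        if decision_path(tree, candidate) == target:
            variants.append(candidate)
    return variants


row = {"glucose": 140.0, "bmi": 33.5, "age": 52.0}
variants = augment(TREE, row, n_variants=5)
```

Rejection sampling keeps the sketch simple; records that sit close to a split threshold will reject more candidates, which is why `max_tries` bounds the loop.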

Method

ReSS trains a decision tree and extracts its symbolic decision paths, which then guide an LLM to generate natural-language reasoning. The resulting reasoning data is used to fine-tune an LLM, with scaffold-invariant data augmentation added to improve generalization.
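
The first two steps of the method can be sketched as follows. This is a minimal illustration, not the paper's implementation: the hand-built tree, the feature names, the `extract_scaffold` and `build_prompt` helpers, and the prompt wording are all hypothetical. In practice the tree would be learned from data and the prompt would be sent to an LLM.

```python
from dataclasses import dataclass


@dataclass
class Leaf:
    label: str


@dataclass
class Split:
    feature: str
    threshold: float
    left: "Split | Leaf"   # taken when value <= threshold
    right: "Split | Leaf"  # taken when value > threshold


def extract_scaffold(tree, row):
    """Walk the tree for one record and return (conditions, predicted label).
    The list of conditions is the instance-level decision path."""
    conditions = []
    node = tree
    while isinstance(node, Split):
        value = row[node.feature]
        if value <= node.threshold:
            conditions.append(f"{node.feature} = {value} <= {node.threshold}")
            node = node.left
        else:
            conditions.append(f"{node.feature} = {value} > {node.threshold}")
            node = node.right
    return conditions, node.label


def build_prompt(row, conditions, label):
    """Turn the scaffold into an instruction that constrains the LLM to the path."""
    features = ", ".join(f"{k}={v}" for k, v in row.items())
    steps = "\n".join(f"{i}. {c}" for i, c in enumerate(conditions, start=1))
    return (
        f"Record: {features}\n"
        f"Decision path:\n{steps}\n"
        f"Prediction: {label}\n"
        "Explain this prediction in plain language, citing only the conditions above."
    )


# Hypothetical toy tree and record, for illustration only.
tree = Split("glucose", 125.0, Leaf("low risk"),
             Split("bmi", 30.0, Leaf("moderate risk"), Leaf("high risk")))
row = {"glucose": 140.0, "bmi": 33.5, "age": 52}

conditions, label = extract_scaffold(tree, row)
prompt = build_prompt(row, conditions, label)
```

Pairing each `prompt` with the LLM's generated reasoning would give one fine-tuning example; the choice of training library and data format is left open here.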

In practice

Apply ReSS to high-stakes tabular domains.
Use decision trees for initial logic extraction.
Evaluate reasoning with hallucination, sufficiency, and necessity metrics.
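
The summary names the three metrics but does not define them, so the sketch below uses assumed, commonly used operationalizations rather than the paper's definitions. Hallucination rate is taken as the share of features cited in an explanation that do not appear on the decision path. Sufficiency asks whether the prediction survives when only the cited features are kept, and necessity asks whether the prediction changes when the cited features are masked. The tree, the baseline values used for masking, and all function names are hypothetical, and the toy tree stands in for whatever model is being evaluated.

```python
# Hypothetical toy tree: (feature, threshold, left, right) or a string leaf label.
TREE = ("glucose", 125.0, "low risk", ("bmi", 30.0, "moderate risk", "high risk"))


def predict(tree, row):
    """Follow the tree to a leaf; returns (label, features used on the path)."""
    used = []
    node = tree
    while not isinstance(node, str):
        feature, threshold, left, right = node
        used.append(feature)
        node = left if row[feature] <= threshold else right
    return node, used


def hallucination_rate(cited, scaffold_features):
    """Share of cited features that are not on the decision path (assumed definition)."""
    if not cited:
        return 0.0
    unsupported = [f for f in cited if f not in scaffold_features]
    return len(unsupported) / len(cited)


def is_sufficient(tree, row, cited, baseline):
    """Keep only cited features, set the rest to baseline; prediction should hold."""
    masked = {k: (v if k in cited else baseline[k]) for k, v in row.items()}
    return predict(tree, masked)[0] == predict(tree, row)[0]


def is_necessary(tree, row, cited, baseline):
    """Set cited features to baseline; prediction should change."""
    masked = {k: (baseline[k] if k in cited else v) for k, v in row.items()}
    return predict(tree, masked)[0] != predict(tree, row)[0]


row = {"glucose": 140.0, "bmi": 33.5, "age": 52.0}
baseline = {"glucose": 100.0, "bmi": 25.0, "age": 40.0}  # assumed reference values
_, scaffold_features = predict(TREE, row)

# An explanation that cites an off-path feature (age) and omits bmi.
weak = ["glucose", "age"]
# An explanation that cites exactly the features on the path.
faithful = ["glucose", "bmi"]

weak_scores = (hallucination_rate(weak, scaffold_features),
               is_sufficient(TREE, row, weak, baseline),
               is_necessary(TREE, row, weak, baseline))
faithful_scores = (hallucination_rate(faithful, scaffold_features),
                   is_sufficient(TREE, row, faithful, baseline),
                   is_necessary(TREE, row, faithful, baseline))
```

In this toy case the explanation that omits `bmi` and cites `age` is flagged as half hallucinated and insufficient, while the path-aligned explanation passes all three checks. Averaging these per-record results over a test set would give dataset-level scores.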

Topics

ReSS Framework
Tabular Data Prediction
Symbolic Scaffolds
Decision Trees
Large Language Models

Code references

huggingface/trl

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, Data Scientist

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.