DiARC: Distinguishing Positive and Negative Samples Helps Improving ARC-like Reasoning Ability of Large Language Models

2026-06-25 · Source: Artificial Intelligence · Field: Technology & Digital — Artificial Intelligence & Machine Learning · Depth: Expert, quick

Summary

DiARC is a novel method designed to enhance the Abstraction and Reasoning Corpus (ARC)-like reasoning capabilities of large language models (LLMs). Addressing the limitations of current LLM approaches, which are either unsatisfactory for open-source models or costly for closed-source ones, DiARC moves beyond traditional data augmentation. It posits that improving ARC-like problem-solving requires not only positive sample supervision but also the ability to distinguish negative samples. Drawing inspiration from preference alignment, DiARC constructs preference pairs. It introduces three specific techniques for generating negative samples: output-level visual transformations, DSL-level rule inversion, and task-specific rule editing. These methods create informative "near-miss" alternatives while preserving original demonstrations. Experimental results demonstrate that DiARC consistently improves performance across various ARC-like benchmarks. The project's code is publicly available at https://github.com/szu-tera/DiARC.

Key takeaway

Research scientists developing LLMs for complex reasoning tasks, such as those in the Abstraction and Reasoning Corpus, should add negative-sample discrimination to their training methodology. According to the authors, data augmentation alone is insufficient. Use preference alignment so that models learn to tell correct outputs from "near-miss" incorrect ones, and construct informative negative samples with output-level visual transformations, DSL-level rule inversion, or task-specific rule editing. The reported experiments show consistent improvements across ARC-like benchmarks.

Key insights

Distinguishing negative samples via preference alignment improves LLM reasoning on ARC-like tasks.

Principles

Reasoning improvement requires distinguishing negative samples.
Preference alignment can enhance model reasoning.

Method

DiARC constructs preference pairs by generating negative samples through output-level visual transformations, DSL-level rule inversion, or task-specific rule editing.
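As an illustration of the output-level visual transformation idea, here is a minimal sketch of how a preference pair might be built from one ARC-style task. It is not taken from the DiARC repository: the grid representation (lists of integer color codes), the choice of transformations, and every name below (`make_preference_pair`, `TRANSFORMS`, the dictionary keys) are assumptions made for illustration. The prompt, which carries the original demonstrations, is left unchanged; only the answer grid is perturbed.

```python
from typing import Dict, List, Optional

Grid = List[List[int]]  # assumed representation: rows of integer color codes


def flip_horizontal(grid: Grid) -> Grid:
    """Mirror each row left to right."""
    return [list(reversed(row)) for row in grid]


def flip_vertical(grid: Grid) -> Grid:
    """Reverse the order of the rows."""
    return [list(row) for row in reversed(grid)]


def rotate_90(grid: Grid) -> Grid:
    """Rotate the grid 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


# Hypothetical set of output-level transformations; the paper may use others.
TRANSFORMS = [flip_horizontal, flip_vertical, rotate_90]


def make_preference_pair(prompt: str, correct: Grid) -> Optional[Dict[str, object]]:
    """Pair the correct output with a visually transformed "near-miss".

    Returns None when every transformation leaves the grid unchanged
    (for example, a fully symmetric grid), because such a candidate
    would not be a valid negative sample.
    """
    for transform in TRANSFORMS:
        candidate = transform(correct)
        if candidate != correct:
            return {
                "prompt": prompt,        # original demonstrations, untouched
                "chosen": correct,       # positive sample
                "rejected": candidate,   # negative sample
                "transform": transform.__name__,
            }
    return None
```

The check `candidate != correct` matters in this sketch: a symmetric grid can survive a flip or rotation unchanged, and a "negative" that equals the correct answer would give the model a contradictory training signal.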

In practice

Apply preference alignment to improve LLM reasoning.
Generate near-miss negative samples for complex tasks.
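The summary does not say which preference-alignment objective DiARC uses. As one common possibility, the sketch below computes a Direct Preference Optimization (DPO) style loss for a single preference pair. The use of DPO, the function name `dpo_loss`, and the default `beta` are assumptions for illustration, not details confirmed by the source. Inputs are summed log-probabilities of the chosen and rejected outputs under the trained policy and under a frozen reference model.

```python
import math


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # assumed temperature; tune per setup
) -> float:
    """DPO-style loss for one (chosen, rejected) pair.

    loss = -log(sigmoid(beta * ((pc - rc) - (pr - rr))))
    where pc/pr are policy log-probs and rc/rr are reference log-probs.
    """
    margin = beta * (
        (policy_chosen_logp - ref_chosen_logp)
        - (policy_rejected_logp - ref_rejected_logp)
    )
    # Numerically stable form of -log(sigmoid(margin)).
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

When the policy and reference agree, the margin is zero and the loss equals log 2. The loss falls as the policy assigns relatively more probability to the correct grid than to the near-miss, which is the discrimination behavior the takeaway describes.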

Topics

Large Language Models
ARC Reasoning
Preference Alignment
Negative Sampling
Data Augmentation

Code references

szu-tera/DiARC

Best for: AI Scientist, Research Scientist, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.