Code Reasoning for Software Engineering Tasks: A Survey and A Call to Action

· Source: cs.SE updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Software Development & Engineering, Robotics & Autonomous Systems · Depth: Expert, extended

Summary

The paper "Code Reasoning for Software Engineering Tasks: A Survey and A Call to Action" is presented as the first survey dedicated to code reasoning techniques for software engineering (SWE) tasks. It examines how large language models (LLMs) handle code generation, translation, summarization, and repair, with particular attention to resolving real-world GitHub issues. The survey introduces a taxonomy of techniques covering Code Chain-of-Thought (CoT) reasoning, execution-based reasoning, and inference scaling, and it treats both agentic and non-agentic approaches to SWE tasks. It also gives an overview of reported performance on common benchmarks such as APPS, HumanEval, MBPP, and SWE-bench, and highlights under-explored benchmarks and open research gaps.

Key takeaway

For AI scientists and machine learning engineers developing code-generating LLMs: prioritize combining modular Chain-of-Thought (CoT) prompting with execution-based feedback and inference scaling. According to the survey, this hybrid approach, especially within agentic frameworks, improves performance on complex software engineering tasks such as GitHub issue resolution and on competitive programming benchmarks. To address current limitations and improve generalizability, focus on multilingual reasoning and on code-specific planning for agents.
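
As a rough illustration of pairing plan-style prompting with execution-based feedback, the sketch below runs a generate, execute, and retry loop. It is a minimal sketch and not the survey's method: `toy_model` is a hypothetical stand-in for an LLM call, and the prompt wording, the `Feedback:` marker, and the function names are all assumptions made for the example.

```python
from typing import Any, Callable, Dict, List, Tuple

Test = Tuple[tuple, Any]  # (positional args, expected return value)


def toy_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM: returns a fix once it sees feedback."""
    if "Feedback:" in prompt:
        return "def is_even(n):\n    return n % 2 == 0"
    return "def is_even(n):\n    return n % 2 == 1"  # deliberately buggy first draft


def run_tests(source: str, tests: List[Test], entry: str) -> List[str]:
    """Execute candidate code and return human-readable failure messages."""
    namespace: Dict[str, Any] = {}
    try:
        exec(source, namespace)  # assumption: trusted toy code; sandbox real model output
        fn = namespace[entry]
    except Exception as exc:
        return [f"could not load {entry}: {exc!r}"]
    failures = []
    for args, expected in tests:
        try:
            got = fn(*args)
        except Exception as exc:
            failures.append(f"{entry}{args} raised {exc!r}")
            continue
        if got != expected:
            failures.append(f"{entry}{args} returned {got!r}, expected {expected!r}")
    return failures


def solve_with_feedback(
    task: str,
    tests: List[Test],
    entry: str,
    model: Callable[[str], str] = toy_model,
    max_rounds: int = 3,
) -> Tuple[str, int, bool]:
    """Ask for a plan plus code, then feed test failures back until tests pass."""
    prompt = f"Task: {task}\nFirst write a short plan, then the code."
    code = ""
    for round_no in range(1, max_rounds + 1):
        code = model(prompt)
        failures = run_tests(code, tests, entry)
        if not failures:
            return code, round_no, True
        # Execution feedback: append the observed failures to the next prompt.
        prompt += "\nFeedback: " + "; ".join(failures)
    return code, max_rounds, False
```

With the toy model, the first draft fails both tests and the second round succeeds after the failure messages are appended to the prompt. A real system would replace `toy_model` with a model call and run the generated code in an isolated process.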

Key insights

LLM code reasoning improves through structured CoT, execution feedback, and inference scaling, culminating in agentic systems.

Method

The paper categorizes code reasoning into Code CoT (plan-based, structure-based, fine-tuning), execution-based (self-evaluation, training with feedback, automated test generation), and inference scaling (sampling, search).
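
To make the sampling branch of inference scaling concrete, and to show how it can be combined with execution-based selection, here is a minimal best-of-n sketch. It is an illustration under stated assumptions rather than an implementation from the paper: `sample_candidates` is a hypothetical placeholder that returns fixed programs where a real system would sample from a model, and the scoring rule (number of tests passed) is one simple choice among many.

```python
from typing import Any, Dict, List, Tuple

Test = Tuple[tuple, Any]  # (positional args, expected return value)


def sample_candidates(task: str, n: int) -> List[str]:
    """Hypothetical sampler: a real system would draw n completions from an LLM."""
    pool = [
        "def add(a, b):\n    return a - b",  # buggy candidate
        "def add(a, b):\n    return a + b",  # correct candidate
        "def add(a, b):\n    return a * b",  # buggy candidate
    ]
    return pool[:n]


def count_passed(source: str, tests: List[Test], entry: str) -> int:
    """Execute one candidate and count how many tests it passes."""
    namespace: Dict[str, Any] = {}
    try:
        exec(source, namespace)  # assumption: trusted toy code; sandbox real model output
        fn = namespace[entry]
    except Exception:
        return 0
    passed = 0
    for args, expected in tests:
        try:
            if fn(*args) == expected:
                passed += 1
        except Exception:
            pass  # a crashing test counts as a failure
    return passed


def best_of_n(task: str, tests: List[Test], entry: str, n: int = 3) -> Tuple[int, str]:
    """Sample n candidates and keep the one that passes the most tests."""
    candidates = sample_candidates(task, n)
    scored = [(count_passed(c, tests, entry), c) for c in candidates]
    return max(scored, key=lambda pair: pair[0])
```

Raising `n` spends more compute at inference time in exchange for a better chance that at least one sample passes. Search-based variants would expand or refine partial candidates instead of drawing independent samples.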

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, AI Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.SE updates on arXiv.org.