Reliable Reasoning with Large Language Models via Preference-Based Maximum Satisfiability

· Source: Artificial Intelligence · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Logic in Computer Science · Depth: Expert, quick

Summary

A hybrid reasoning approach improves the ability of Large Language Models (LLMs) to handle optimization tasks with multiple constraints and user-defined preferences, particularly in domains such as robotics. Given a natural language problem description, the LLM generates Python code that encodes the constraints and preferences as a preference-based Maximum Satisfiability (MaxSAT) problem, which an exact MaxSAT solver then solves. The solutions produced by the model-generated code are independently verified for feasibility and optimality against a canonical MaxSAT encoding. In evaluations with both open-source and closed-access LLMs on three families of preference-based reasoning tasks, acceptance rates exceed 80%, significantly outperforming direct-answer, chain-of-thought, and program-of-thought baselines, which often fail to produce feasible solutions.

Key takeaway

AI Scientists and Machine Learning Engineers designing LLM-based optimization systems should consider a hybrid approach in which the LLM generates code that encodes the problem as preference-based MaxSAT and an exact solver computes the answer. In the reported evaluations, this approach achieved acceptance rates above 80%, while direct-answer, chain-of-thought, and program-of-thought baselines often failed to produce feasible solutions. Independently verify solver outputs against a canonical encoding so that results for complex constraint satisfaction problems, particularly in robotics, can be checked for feasibility and optimality.

Key insights

LLMs can achieve reliable, verifiable optimization by externalizing reasoning into MaxSAT code generation and solver-based verification.

Principles

Method

LLMs generate Python code encoding natural language problems into preference-based MaxSAT. An exact MaxSAT solver finds solutions, which are then independently verified for feasibility and optimality against a canonical encoding.
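
The method can be illustrated with a minimal sketch. It is not the authors' implementation: the tiny brute-force search stands in for an exact MaxSAT solver, and the toy robot scenario, the variable numbering, the weights, and the helper names (`solve_maxsat`, `verify`) are all assumptions made for illustration. Hard clauses model constraints that must hold, weighted soft clauses model preferences, and the verifier checks a candidate for feasibility and optimality against a reference encoding.

```python
from itertools import product

# Hypothetical toy problem. Variables: 1 = visit A, 2 = visit B, 3 = carry tool.
# Clauses are lists of signed integers (positive = true, negative = false).
HARD = [
    [1, 2],    # the robot must visit A or B
    [-1, -2],  # but not both
    [-2, 3],   # visiting B requires carrying the tool
]
# Soft clauses as (clause, weight): violating a clause costs its weight.
SOFT = [
    ([2], 4),   # strongly prefer visiting B
    ([-3], 2),  # prefer not carrying the tool
    ([1], 1),   # mildly prefer visiting A
]


def satisfied(clause, assignment):
    """A clause holds if at least one of its literals is true."""
    return any(assignment[abs(lit)] == (lit > 0) for lit in clause)


def cost(soft, assignment):
    """Total weight of the violated soft clauses."""
    return sum(w for clause, w in soft if not satisfied(clause, assignment))


def solve_maxsat(n_vars, hard, soft):
    """Exhaustive exact search (stand-in for a real exact MaxSAT solver).

    Returns (assignment, cost) for a minimum-cost feasible assignment,
    or None if the hard clauses are unsatisfiable.
    """
    best = None
    for values in product([False, True], repeat=n_vars):
        assignment = dict(zip(range(1, n_vars + 1), values))
        if not all(satisfied(c, assignment) for c in hard):
            continue
        c = cost(soft, assignment)
        if best is None or c < best[1]:
            best = (assignment, c)
    return best


def verify(candidate, n_vars, hard, soft):
    """Check a candidate against a reference (canonical) encoding.

    Returns (feasible, optimal): feasible means every hard clause holds;
    optimal means the candidate's cost equals the reference optimum.
    """
    feasible = all(satisfied(c, candidate) for c in hard)
    reference = solve_maxsat(n_vars, hard, soft)
    optimal = (
        feasible
        and reference is not None
        and cost(soft, candidate) == reference[1]
    )
    return feasible, optimal


solution, penalty = solve_maxsat(3, HARD, SOFT)
feasible, optimal = verify(solution, 3, HARD, SOFT)
```

In a real pipeline the encoding passed to `verify` would be written independently of the model-generated code, so that a candidate which is feasible but suboptimal, or which violates a hard constraint, is rejected rather than accepted on the model's word.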

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.