TRN-R1-Zero: Text-rich Network Reasoning via LLMs with Reinforcement Learning Only

· Source: Computation and Language · Field: Technology & Digital — Artificial Intelligence & Machine Learning · Depth: Expert, quick

Summary

TRN-R1-Zero is a post-training framework for zero-shot reasoning on text-rich networks (TRNs), which combine textual semantics with relational structure. Unlike traditional graph neural networks, and unlike prior large language model (LLM)-based methods that often overlook graph context or rely on distillation, TRN-R1-Zero is trained exclusively via reinforcement learning and requires no task-specific supervision. It optimizes base LLMs with a Neighbour-aware Group Relative Policy Optimisation objective, which dynamically adjusts rewards using a new margin gain metric that measures how informative a node's neighboring signals are, thereby guiding the model toward relational reasoning. The framework needs neither supervised fine-tuning nor chain-of-thought data from larger reasoning models. Experiments on citation, hyperlink, social, and co-purchase TRN benchmarks demonstrate superior performance and robustness. Although training uses only node-level tasks, the model also performs zero-shot inference on edge- and graph-level tasks.

Key takeaway

For research scientists developing LLM-based reasoning systems, TRN-R1-Zero offers a method to achieve zero-shot performance on text-rich networks without extensive supervised data. You should consider integrating reinforcement learning with dynamic reward mechanisms to enhance relational reasoning in your models, particularly for tasks requiring cross-domain transfer or inference on unseen graph structures.

Key insights

TRN-R1-Zero enables zero-shot reasoning on text-rich networks using reinforcement learning without supervised fine-tuning.

Principles

Method

TRN-R1-Zero optimizes base LLMs with a Neighbour-aware Group Relative Policy Optimisation objective, using a margin gain metric to dynamically adjust rewards based on neighboring signal informativeness.
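To make the idea concrete, here is a minimal Python sketch of how a margin-gain signal might modulate group-relative advantages. Only the within-group normalisation follows the standard GRPO recipe. The definitions of `margin` and `margin_gain`, the `tanh` weighting, and the `alpha` parameter are all assumptions made for illustration, since the summary does not give the paper's actual formulas.

```python
import math
from typing import List


def group_relative_advantages(rewards: List[float], eps: float = 1e-8) -> List[float]:
    """Standard GRPO-style step: centre and scale rewards within one sampled group."""
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = math.sqrt(var)
    return [(r - mean) / (std + eps) for r in rewards]


def margin(probs: List[float], label: int) -> float:
    """Hypothetical margin: true-label probability minus the best competing one."""
    best_other = max(p for i, p in enumerate(probs) if i != label)
    return probs[label] - best_other


def margin_gain(with_nbrs: List[float], without_nbrs: List[float], label: int) -> float:
    """Hypothetical margin gain: how much neighbour context widens the margin."""
    return margin(with_nbrs, label) - margin(without_nbrs, label)


def neighbour_aware_advantages(
    correct: List[bool], gain: float, alpha: float = 0.5
) -> List[float]:
    """Hypothetical weighting: scale a query's advantages by a factor that grows
    with its margin gain. The weight is applied after normalisation, because a
    per-query factor applied to raw rewards would cancel in the division by std."""
    base = group_relative_advantages([1.0 if c else 0.0 for c in correct])
    weight = 1.0 + alpha * math.tanh(gain)
    return [weight * a for a in base]


if __name__ == "__main__":
    # Neighbours move the true class (index 0) from losing to clearly winning.
    gain = margin_gain([0.7, 0.2, 0.1], [0.4, 0.5, 0.1], label=0)
    print(neighbour_aware_advantages([True, False, True, False], gain))
```

Under these assumptions, queries whose neighbours are informative (positive gain) contribute larger policy-gradient updates, while queries with uninformative or misleading neighbours are down-weighted.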

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Computation and Language.