SenFlow: Inter-Sentence Flow Modeling for AI-Generated Text Detection in Hybrid Documents

2026-06-17 · Source: Computation and Language · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics, Natural Language Processing · Depth: Expert, quick

Summary

SenFlow is a method for sentence-level AI-generated text detection (S-AGTD) in hybrid documents, where human-written and LLM-generated content coexist. It targets two limitations of prior work: detectors that classify each sentence in isolation, and evaluation on outdated benchmarks. To evaluate the method, the authors constructed MOSAIC, a new benchmark of 16,000 hybrid documents derived from PubMed and XSum, with AI content generated by DeepSeek-V3.2 and Kimi K2. MOSAIC applies a stringent perplexity-consistency filter, which previous benchmarks lack. SenFlow recasts S-AGTD as structured prediction over a document's sentence sequence, combining graph-based inter-sentence propagation with linear-chain CRF decoding in a single document-level pass. It achieves state-of-the-art performance on MOSAIC, with an average Macro-F1 margin of +4.15 percentage points on cross-domain transfer, the most challenging protocol. The authors also find that, even after perplexity filtering, AI insertions retain a generator-dependent sentence-length gap.
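Macro-F1, the metric quoted above, averages the per-class F1 scores so that the human and AI classes count equally regardless of how many sentences each has. The following is a minimal sketch of that calculation for binary sentence labels (0 = human, 1 = AI). The function name and the toy labels are illustrative assumptions, not taken from the paper or its code.

```python
def macro_f1(gold, pred, classes=(0, 1)):
    """Unweighted mean of per-class F1 scores."""
    per_class = []
    for c in classes:
        tp = sum(g == c and p == c for g, p in zip(gold, pred))
        fp = sum(g != c and p == c for g, p in zip(gold, pred))
        fn = sum(g == c and p != c for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        per_class.append(2 * tp / denom if denom else 0.0)
    return sum(per_class) / len(per_class)


# Toy labels for one five-sentence document (made up for illustration).
gold = [0, 0, 0, 1, 1]
pred = [0, 1, 0, 1, 1]
score = macro_f1(gold, pred)
```

A margin in percentage points is the plain difference between two such scores after multiplying each by 100.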

Key takeaway

For NLP engineers building AI-generated text detectors: classifying sentences in isolation is insufficient for hybrid documents. Model inter-sentence dependencies, as SenFlow's structured prediction approach does, to improve accuracy. Consider evaluating detectors on MOSAIC, which includes output from recent LLMs and applies a perplexity-consistency filter, to test robustness against modern generators.

Key insights

SenFlow improves AI-generated text detection in hybrid documents by modeling inter-sentence dependencies, and is evaluated on the new MOSAIC benchmark.

Principles

Inter-sentence dependencies are crucial for S-AGTD.
Structured prediction enhances sentence-level detection.
Perplexity filters don't eliminate all AI cues.

Method

SenFlow recasts S-AGTD as structured prediction, integrating graph-based inter-sentence propagation with linear-chain CRF decoding in a single document-level pass over a sentence graph.
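To make the two ingredients concrete, here is a toy sketch of score propagation over a sentence graph followed by linear-chain Viterbi decoding. It is not SenFlow's implementation: the averaging rule in `propagate`, the `alpha` and `steps` parameters, the hand-set transition scores, and the example numbers are all assumptions chosen for illustration. In a real CRF the emission and transition scores would be learned.

```python
def propagate(emissions, neighbors, alpha=0.5, steps=2):
    """Blend each sentence's scores with the mean of its graph neighbors."""
    scores = [row[:] for row in emissions]
    for _ in range(steps):
        updated = []
        for i, base in enumerate(emissions):
            nbrs = neighbors[i]
            row = []
            for k in range(len(base)):
                # Isolated nodes fall back to their own score.
                avg = sum(scores[j][k] for j in nbrs) / len(nbrs) if nbrs else base[k]
                row.append((1 - alpha) * base[k] + alpha * avg)
            updated.append(row)
        scores = updated
    return scores


def viterbi(scores, transitions):
    """Highest-scoring label sequence under a linear-chain model."""
    num_labels = len(scores[0])
    best = scores[0][:]
    backptrs = []
    for t in range(1, len(scores)):
        step_best, step_ptr = [], []
        for j in range(num_labels):
            # Best previous label for arriving at label j on sentence t.
            prev = max(range(num_labels), key=lambda i: best[i] + transitions[i][j])
            step_ptr.append(prev)
            step_best.append(best[prev] + transitions[prev][j] + scores[t][j])
        best = step_best
        backptrs.append(step_ptr)
    label = max(range(num_labels), key=lambda i: best[i])
    path = [label]
    for step_ptr in reversed(backptrs):
        label = step_ptr[label]
        path.append(label)
    return path[::-1]


# Per-sentence scores for (human, AI); sentence 1 is a noisy borderline case.
emissions = [[2.0, 0.0], [0.9, 1.0], [2.0, 0.0], [0.0, 2.0], [0.0, 2.0]]
# Hypothetical sentence graph: each sentence is linked to its adjacent sentences.
neighbors = [[1], [0, 2], [1, 3], [2, 4], [3]]
# Hand-set transitions that reward staying in the same label.
transitions = [[0.5, -0.5], [-0.5, 0.5]]

isolated = [max(range(2), key=lambda k: row[k]) for row in emissions]
labels = viterbi(propagate(emissions, neighbors), transitions)
```

In this toy case, classifying each sentence on its own flags sentence 1 as AI on the strength of a marginal score, while the structured pass keeps it with its human-written neighbors and still recovers the AI span at the end.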

In practice

Use MOSAIC benchmark for S-AGTD evaluation.
Consider inter-sentence context for detection.
Analyze sentence length as an AI insertion cue.
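As a starting point for the sentence-length check above, the sketch below compares the mean length of AI-labeled and human-labeled sentences. Whitespace tokenization, the function name, and the example sentences are assumptions made for illustration; the source does not specify how the length gap was measured.

```python
from statistics import mean


def length_gap(sentences, labels):
    """Mean token count of AI sentences (label 1) minus human sentences (label 0)."""
    ai = [len(s.split()) for s, y in zip(sentences, labels) if y == 1]
    human = [len(s.split()) for s, y in zip(sentences, labels) if y == 0]
    if not ai or not human:
        raise ValueError("need at least one sentence of each class")
    return mean(ai) - mean(human)


# Made-up example document, not drawn from MOSAIC.
sentences = [
    "The trial enrolled forty patients.",
    "Results were mixed.",
    "Overall, these findings suggest that further study is clearly warranted.",
]
labels = [0, 0, 1]
gap = length_gap(sentences, labels)
```

Because the reported gap is generator-dependent, it is worth computing it separately for each generator rather than pooling all AI sentences together.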

Topics

AI-Generated Text Detection
Hybrid Documents
Structured Prediction
Inter-Sentence Dependencies
MOSAIC Benchmark
Large Language Models

Code references

luojingkun22/SenFlow

Best for: Research Scientist, AI Scientist, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Computation and Language.