SenFlow: Inter-Sentence Flow Modeling for AI-Generated Text Detection in Hybrid Documents
Summary
SenFlow is an approach to Sentence-level AI-Generated Text Detection (S-AGTD) for hybrid documents, in which human-written and LLM-generated content are interleaved. It targets two limitations of prior work: methods that classify each sentence in isolation, and evaluation on outdated benchmarks. SenFlow recasts S-AGTD as structured prediction, combining graph-based inter-sentence propagation with linear-chain Conditional Random Field (CRF) decoding. Alongside SenFlow, the authors construct the MOSAIC benchmark: 16,000 hybrid documents drawn from PubMed and XSum, generated with DeepSeek-V3.2 and Kimi K2, and filtered for perplexity consistency. SenFlow achieves state-of-the-art performance on MOSAIC, with a +4.15 percentage-point average Macro-F1 margin on cross-domain transfer. A key finding is that even after perplexity equalization, AI insertions retain a generator-dependent sentence-length gap that S-AGTD methods can exploit.
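Because the headline result is reported in Macro-F1, it may help to recall how that metric is computed for sentence labels. The sketch below is a minimal, standard-library illustration; the label convention (0 = human, 1 = AI) and the function name `macro_f1` are assumptions for this example and are not taken from the paper.

```python
def macro_f1(y_true, y_pred, labels=(0, 1)):
    """Unweighted mean of per-class F1 scores.

    Assumed label convention for this sketch: 0 = human, 1 = AI.
    """
    scores = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example: four sentences, one human sentence misclassified as AI.
score = macro_f1([0, 0, 0, 1], [0, 0, 1, 1])
```

Macro-F1 weights both classes equally, so a detector cannot score well by favoring whichever label dominates a hybrid document.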
Key takeaway
Machine learning engineers building AI-text detectors should note that current sentence-level methods often fail on hybrid documents. Integrating inter-sentence flow modeling and structured prediction, as SenFlow does, improves detection accuracy and generalization to unseen domains and generators. The approach identifies interleaved human and AI content even when perplexity cues are minimized, which makes it a more reliable basis for content-authenticity tooling.
Key insights
Modeling inter-sentence dependencies via structured prediction significantly enhances AI-generated text detection in hybrid documents.
Principles
- Authorship transitions often cluster in contiguous blocks.
- Semantic similarity reveals long-range authorship cues.
- Perplexity filtering does not eliminate all AI detection signals.
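The first principle is what makes joint decoding useful: if labels tend to persist across neighboring sentences, a transition score that rewards staying in the same label can overrule a weak, isolated per-sentence guess. The following is a minimal Viterbi sketch for a two-label linear-chain model. The emission and transition scores are made-up toy values, and the label convention (0 = human, 1 = AI) is an assumption for illustration; none of this is SenFlow's trained model.

```python
def viterbi(emissions, transitions):
    """Return the highest-scoring label sequence.

    emissions[t][l]   : score of label l for sentence t
    transitions[p][l] : score of moving from label p to label l
    """
    n_labels = len(transitions)
    score = list(emissions[0])
    backptrs = []
    for emit in emissions[1:]:
        new_score, ptrs = [], []
        for cur in range(n_labels):
            # Best previous label for arriving at `cur`.
            best_prev = max(range(n_labels),
                            key=lambda p: score[p] + transitions[p][cur])
            new_score.append(score[best_prev] + transitions[best_prev][cur] + emit[cur])
            ptrs.append(best_prev)
        score = new_score
        backptrs.append(ptrs)
    # Backtrack from the best final label.
    path = [max(range(n_labels), key=lambda l: score[l])]
    for ptrs in reversed(backptrs):
        path.append(ptrs[path[-1]])
    return path[::-1]


# Toy scores (hypothetical): columns are (human, AI) for each sentence.
emissions = [[2.0, 0.0], [1.5, 0.0], [0.4, 0.6], [2.0, 0.0], [0.0, 2.5], [0.0, 1.8]]
transitions = [[1.0, -1.0], [-1.0, 1.0]]  # reward staying, penalize switching

independent = [max((0, 1), key=lambda l: e[l]) for e in emissions]
joint = viterbi(emissions, transitions)
# independent -> [0, 0, 1, 0, 1, 1]
# joint       -> [0, 0, 0, 0, 1, 1]
```

Classifying each sentence alone flips the third sentence to AI on a narrow margin, while joint decoding keeps it human and places a single transition before the final two-sentence block.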
Method
SenFlow extracts features from a LoRA-aligned Llama-3.1-8B-Instruct proxy, fuses semantic and style embeddings using cross-attention, then refines representations with a Hybrid Adjacency GCN and decodes with a linear-chain CRF for joint sequence prediction.
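The graph refinement step can be pictured with a small sketch. This summary does not specify how the Hybrid Adjacency is built, so the code below makes a loud assumption: an edge links two sentences if they are adjacent in the document or if their embeddings have cosine similarity above a threshold. The function names, the `sim_threshold` value, and the toy embeddings are hypothetical. The propagation uses the standard symmetric GCN normalization and omits the learned weight matrix and nonlinearity for brevity.

```python
import math


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def hybrid_adjacency(embeddings, sim_threshold=0.8):
    """Assumed construction: sequential edges plus semantic-similarity edges."""
    n = len(embeddings)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        adj[i][i] = 1.0  # self-loop, as in the usual A + I formulation
        for j in range(i + 1, n):
            sequential = (j == i + 1)
            semantic = cosine(embeddings[i], embeddings[j]) >= sim_threshold
            if sequential or semantic:
                adj[i][j] = adj[j][i] = 1.0
    return adj


def gcn_propagate(adj, features):
    """One propagation step: D^{-1/2} A D^{-1/2} H (no weights, no activation)."""
    deg = [sum(row) for row in adj]
    out = []
    for i, row_adj in enumerate(adj):
        row = [0.0] * len(features[0])
        for j, a in enumerate(row_adj):
            if a:
                w = a / math.sqrt(deg[i] * deg[j])
                for k, x in enumerate(features[j]):
                    row[k] += w * x
        out.append(row)
    return out


# Toy sentence embeddings (hypothetical): sentences 0 and 2 are similar, as are 1 and 3.
emb = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.0, 1.0]]
adj = hybrid_adjacency(emb)
refined = gcn_propagate(adj, emb)
```

Under this assumed construction, sentence 0 exchanges information with sentence 2 even though they are not neighbors, which is one way long-range semantic cues could reach the decoder.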
In practice
- Integrate graph-based inter-sentence modeling for S-AGTD.
- Construct benchmarks with perplexity-consistency filters.
- Analyze sentence length as a detection feature.
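As a starting point for the last item, the sentence-length gap can be measured directly on any labeled hybrid corpus. The sketch below assumes whitespace tokenization and a 0 = human, 1 = AI label convention; both are simplifications for illustration, and the example sentences are invented toy data, not drawn from MOSAIC.

```python
from statistics import mean


def length_gap(sentences, labels):
    """Mean AI sentence length minus mean human sentence length, in whitespace tokens.

    Assumes both classes are present; labels use 0 = human, 1 = AI.
    """
    human = [len(s.split()) for s, y in zip(sentences, labels) if y == 0]
    ai = [len(s.split()) for s, y in zip(sentences, labels) if y == 1]
    return mean(ai) - mean(human)


# Invented toy data for illustration only.
sentences = [
    "The trial enrolled forty patients.",
    "Overall, these findings highlight the importance of early and sustained intervention.",
    "Results were mixed.",
]
labels = [0, 1, 0]
gap = length_gap(sentences, labels)
```

Computing the gap separately per generator would show whether it is generator-dependent, as the summary reports for MOSAIC.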
Topics
- AI-Generated Text Detection
- Hybrid Documents
- Sentence-Level Detection
- Graph Neural Networks
- Conditional Random Fields
- LLM Benchmarking
- DeepSeek-V3.2
Best for: Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.CL updates on arXiv.org.