SenFlow: Inter-Sentence Flow Modeling for AI-Generated Text Detection in Hybrid Documents
Summary
SenFlow is an approach to sentence-level AI-generated text detection (S-AGTD) in hybrid documents. It targets two limitations of existing work: detection methods that ignore inter-sentence dependencies, and benchmarks that are outdated. The researchers built MOSAIC, a new benchmark of 16,000 hybrid documents sourced from PubMed and XSum, with AI-generated insertions produced by recent LLMs such as DeepSeek-V3.2 and Kimi K2 and filtered for perplexity consistency. SenFlow recasts S-AGTD as structured prediction, combining graph-based inter-sentence propagation with linear-chain CRF decoding in a single document-level pass. The method achieved state-of-the-art performance on MOSAIC, with a +4.15 percentage-point average Macro-F1 margin in cross-domain transfer, the most challenging evaluation protocol. A notable finding is that even after perplexity filtering, AI-generated insertions show a generator-dependent sentence-length disparity, which sentence-level detectors can still exploit to improve accuracy.
Key takeaway
If you are an NLP engineer developing AI-generated text detection systems, move beyond isolated sentence analysis. Incorporate inter-sentence flow modeling, such as SenFlow's graph-based propagation and CRF decoding, to improve accuracy in hybrid documents. Make sure your benchmarks include outputs from current LLMs such as DeepSeek-V3.2 and Kimi K2, and apply perplexity-consistency filters so that results reflect robustness against the latest generators. Also consider subtler cues such as generator-dependent differences in sentence length.
Key insights
Detecting AI-generated text in hybrid documents requires modeling inter-sentence dependencies, not just isolated sentences, for improved accuracy.
Principles
- Inter-sentence dependencies improve S-AGTD.
- Benchmarks need current LLMs and quality filters.
- AI insertions show generator-specific length patterns.
Method
SenFlow models S-AGTD as structured prediction, using graph-based inter-sentence propagation and linear-chain CRF decoding in a single document-level pass over a sentence graph.
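The idea of propagating evidence between sentences and then decoding all labels jointly can be illustrated with a minimal sketch. This is not SenFlow's implementation: the neighbor-averaging step, the `window` and `alpha` parameters, the hand-set transition scores, and the function names `propagate`, `viterbi`, and `detect` are all assumptions made for illustration. The input is assumed to be one per-sentence score where positive values lean toward "AI-generated".

```python
from typing import List, Sequence


def propagate(scores: Sequence[float], window: int = 1, alpha: float = 0.5) -> List[float]:
    """Mix each sentence score with the mean score of its neighbours.

    Assumption: the sentence graph links sentences within `window` positions.
    """
    n = len(scores)
    out = []
    for i in range(n):
        nbrs = [scores[j] for j in range(max(0, i - window), min(n, i + window + 1)) if j != i]
        nbr_mean = sum(nbrs) / len(nbrs) if nbrs else scores[i]
        out.append((1 - alpha) * scores[i] + alpha * nbr_mean)
    return out


def viterbi(emissions: Sequence[Sequence[float]], transitions: Sequence[Sequence[float]]) -> List[int]:
    """Highest-scoring label sequence for a linear-chain model.

    emissions[t][y] scores label y at sentence t;
    transitions[a][b] scores moving from label a to label b.
    """
    n_labels = len(transitions)
    best = list(emissions[0])
    backpointers = []
    for t in range(1, len(emissions)):
        new_best, ptr = [], []
        for y in range(n_labels):
            cands = [best[p] + transitions[p][y] for p in range(n_labels)]
            p_star = max(range(n_labels), key=cands.__getitem__)
            new_best.append(cands[p_star] + emissions[t][y])
            ptr.append(p_star)
        best = new_best
        backpointers.append(ptr)
    # Trace back from the best final label.
    y = max(range(n_labels), key=best.__getitem__)
    path = [y]
    for ptr in reversed(backpointers):
        y = ptr[y]
        path.append(y)
    return path[::-1]


def detect(ai_scores: Sequence[float], stay: float = 1.0, switch: float = -1.0) -> List[int]:
    """Label each sentence 0 (human) or 1 (AI) in one document-level pass."""
    smoothed = propagate(ai_scores)
    emissions = [[-s, s] for s in smoothed]          # label 0 gets -s, label 1 gets s
    transitions = [[stay, switch], [switch, stay]]   # hand-set here; learned in a real CRF
    return viterbi(emissions, transitions)


# Toy scores (made up): one weakly positive sentence sits inside a human span.
toy_scores = [-2.0, -1.5, 0.3, -1.8, 2.0, 2.5, 1.9]
labels = detect(toy_scores)
```

On these toy scores, thresholding each sentence independently at zero would flag the third sentence, while the joint decode assigns it to the surrounding human span. In a trained system the emission and transition scores would be learned rather than set by hand.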
In practice
- Incorporate inter-sentence context for detection.
- Build benchmarks with recent LLMs and quality filters.
- Exploit generator-dependent sentence length gaps.
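As a rough illustration of the last point, the sketch below measures how far the mean length of AI-generated sentences drifts from the human baseline, per generator. The record layout, the whitespace tokenization, the function name `length_gap_by_generator`, and the placeholder generator names are assumptions for the example; the toy sentences are made up and do not reproduce any MOSAIC data or results.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, Optional, Tuple

Record = Tuple[str, str, Optional[str]]  # (sentence, "human" | "ai", generator name or None)


def length_gap_by_generator(records: Iterable[Record]) -> Dict[str, float]:
    """Return mean AI sentence length minus mean human sentence length, per generator.

    Length is counted in whitespace-separated tokens, a simplifying assumption.
    """
    human_lens = []
    ai_lens = defaultdict(list)
    for sentence, label, generator in records:
        n_tokens = len(sentence.split())
        if label == "ai":
            ai_lens[generator].append(n_tokens)
        else:
            human_lens.append(n_tokens)
    human_mean = mean(human_lens)
    return {gen: mean(lens) - human_mean for gen, lens in ai_lens.items()}


# Toy records with placeholder generator names.
toy_records = [
    ("The trial enrolled forty patients.", "human", None),
    ("Results were mixed.", "human", None),
    ("The intervention group showed a statistically significant improvement overall.", "ai", "gen_a"),
    ("Outcomes improved.", "ai", "gen_b"),
]
gaps = length_gap_by_generator(toy_records)
```

A positive gap means that generator's insertions run longer than the human text, a negative gap means shorter. A gap like this could be fed to a detector as an extra feature, or checked when auditing a benchmark for length leakage.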
Topics
- AI-Generated Text Detection
- Hybrid Documents
- Inter-Sentence Flow Modeling
- Structured Prediction
- LLM Benchmarking
- DeepSeek-V3.2
- Kimi K2
Best for: Research Scientist, AI Scientist, NLP Engineer, Machine Learning Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Takara TLDR - Daily AI Papers.