HybridCodeAuthorship: A Benchmark Dataset for Line-Level Code Authorship Detection
Summary
HybridCodeAuthorship is a novel benchmark dataset for detecting AI-generated code in industry codebases that increasingly blend human and AI contributions. Unlike existing benchmarks, which often draw on academic problems or assume that an entire code snippet is either human- or AI-authored, HybridCodeAuthorship provides Python code files in which human- and AI-authored lines are interleaved. The dataset is constructed with a pipeline that leverages CodeSearchNet and simulates real-world AI code assistant usage. Initial benchmarking with state-of-the-art algorithms, including AIGCode Detector, shows that the benchmark is challenging: the top algorithm achieved F1 scores of 0.48 on the chunk-level and 0.56 on the line-level detection task.
Key takeaway
Machine Learning Engineers developing code authorship detection systems should integrate the HybridCodeAuthorship benchmark into their evaluation pipelines. The dataset offers a more realistic assessment of algorithm performance on interleaved human- and AI-generated code, reflecting actual industry usage of AI assistants. Given that the top algorithm reaches F1 scores of only 0.48 (chunk-level) and 0.56 (line-level), research should focus on improving fine-grained detection to meet practical risk management and productivity analysis needs.
Key insights
Industry codebases require fine-grained, line-level detection of AI-generated code for risk and productivity analysis.
Principles
- Existing code authorship benchmarks are insufficient for hybrid AI/human code.
- Authentic AI assistant usage results in interleaved human and AI code.
- Line-level AI code detection remains a challenging task.
Method
A dataset construction pipeline leverages CodeSearchNet to create Python files with interleaved human- and AI-authored lines, simulating real-world AI assistant usage.
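To make "interleaved human- and AI-authored lines" concrete, here is a minimal sketch of how one such file could be represented, assuming one authorship label per line. The `LabeledFile` class, the `"human"`/`"ai"` label strings, and the `chunks` helper are hypothetical names chosen for illustration; they are not the dataset's actual schema. The sketch also assumes that a chunk is a maximal run of consecutive lines with the same author, which the summary does not specify.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import List, Tuple

# Hypothetical label values; the real dataset may encode authorship differently.
HUMAN, AI = "human", "ai"


@dataclass
class LabeledFile:
    """One Python source file with exactly one authorship label per line."""
    lines: List[str]
    labels: List[str]

    def __post_init__(self) -> None:
        if len(self.lines) != len(self.labels):
            raise ValueError("each line needs exactly one label")


def chunks(f: LabeledFile) -> List[Tuple[str, int, int]]:
    """Return (label, start, end_exclusive) for each maximal same-label run.

    Assumption: a "chunk" is a contiguous run of lines by the same author.
    """
    out = []
    start = 0
    for label, run in groupby(f.labels):
        n = len(list(run))
        out.append((label, start, start + n))
        start += n
    return out


# A toy file: a human writes the signature and the call site,
# and an assistant fills in the function body.
example = LabeledFile(
    lines=[
        "def area(r):",
        "    import math",
        "    return math.pi * r ** 2",
        "print(area(2))",
    ],
    labels=[HUMAN, AI, AI, HUMAN],
)

print(chunks(example))
```

Keeping labels in a list parallel to the lines makes it straightforward to derive coarser units, such as chunks, from the line-level ground truth.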
In practice
- Evaluate AI code detection algorithms on realistic hybrid code scenarios.
- Develop new algorithms for line-level AI code authorship detection.
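As a rough illustration of evaluating a detector at line level, the sketch below computes an F1 score from per-line ground-truth and predicted labels. It assumes that AI-authored lines are the positive class and that scores are computed over a flat sequence of lines; the summary does not state how the benchmark aggregates its F1 scores, so `line_level_f1` is a hypothetical helper and not the benchmark's official scorer.

```python
from typing import Sequence


def line_level_f1(
    true_labels: Sequence[str],
    pred_labels: Sequence[str],
    positive: str = "ai",
) -> float:
    """F1 over lines, treating `positive` (assumed: AI-authored) as the target class."""
    if len(true_labels) != len(pred_labels):
        raise ValueError("label sequences must have the same length")

    pairs = list(zip(true_labels, pred_labels))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)

    if tp == 0:
        # No correct positive predictions: precision or recall is zero or undefined.
        return 0.0

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels for a five-line file (illustrative values only).
truth = ["human", "ai", "ai", "human", "ai"]
preds = ["human", "ai", "human", "ai", "ai"]

score = line_level_f1(truth, preds)
print(f"line-level F1: {score:.3f}")
```

In this toy case the detector finds two of the three AI lines and raises one false alarm, so precision and recall are both 2/3. A chunk-level score could be computed in the same way once a chunk definition and a rule for labeling each chunk are fixed.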
Topics
- HybridCodeAuthorship
- Code Authorship Detection
- AI Code Assistants
- Benchmark Datasets
- Python
- CodeSearchNet
Best for: Research Scientist, CTO, VP of Engineering/Data, AI Scientist, Machine Learning Engineer, Director of AI/ML
Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.