Enhancing Unsupervised Keyword Extraction in Academic Papers through Integrating Highlights with Abstract
Summary
This paper explores enhancing unsupervised keyword extraction from academic papers by integrating the highlights section with the abstract. While prior research primarily used abstracts and references, this study focuses on highlights, which summarize key findings and contributions. Observations indicate highlights offer valuable keyword information that complements abstracts. The research evaluates three input scenarios: abstract only, highlights only, and a combination of both, using four unsupervised models on Computer Science (CS) and Library and Information Science (LIS) datasets. Experiments demonstrate that combining abstracts with highlights significantly improves keyword extraction performance. The study also examines how differences in keyword coverage and content between abstracts and highlights influence extraction outcomes. Data and code are publicly available.
Key takeaway
Research scientists developing information retrieval systems for academic literature should consider integrating the "highlights" section of papers with their abstracts. In the study's experiments, this combined input significantly improved unsupervised keyword extraction performance, which could lead to more accurate and comprehensive indexing and make research papers easier to discover within your systems.
Key insights
Integrating academic paper highlights with abstracts significantly improves unsupervised keyword extraction performance.
Principles
- Highlights complement abstracts for keyword extraction.
- Combined inputs outperform single inputs for keyword extraction.
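One way to make the "complement" idea concrete is to measure how many author-assigned keywords literally appear in each input. The sketch below is an illustration only, not the paper's procedure: the function names `normalize` and `keyword_coverage`, the exact-match criterion, and the sample texts are all assumptions invented for this example.

```python
import re


def normalize(text: str) -> str:
    """Lowercase the text and reduce it to space-separated alphanumeric tokens."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))


def keyword_coverage(text: str, gold_keywords: list[str]) -> float:
    """Fraction of gold keywords that occur verbatim (after normalization) in text."""
    if not gold_keywords:
        return 0.0
    # Pad with spaces so that matches respect token boundaries.
    padded = f" {normalize(text)} "
    hits = sum(1 for kw in gold_keywords if f" {normalize(kw)} " in padded)
    return hits / len(gold_keywords)


# Hypothetical example data, invented for illustration only.
abstract = "We study unsupervised keyword extraction from academic papers."
highlights = (
    "Highlights are combined with the abstract. "
    "Four unsupervised models are compared."
)
gold = ["keyword extraction", "highlights", "unsupervised models", "academic papers"]

coverage = {
    "abstract": keyword_coverage(abstract, gold),
    "highlights": keyword_coverage(highlights, gold),
    "combined": keyword_coverage(abstract + " " + highlights, gold),
}
print(coverage)
```

In this toy case each input alone covers half of the keywords, while the combined text covers all of them, which is the kind of complementarity the principle describes.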
Method
Evaluated three input scenarios (abstract, highlights, combined) with four unsupervised models on CS and LIS datasets to measure keyword extraction performance.
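The three input scenarios and a per-paper score can be sketched as follows. This is a minimal sketch under stated assumptions: the summary does not say which metric the study reports, so exact-match F1 over the top-k predictions is used here as a common choice, and `build_inputs` and `f1_at_k` are hypothetical helper names.

```python
def build_inputs(abstract: str, highlights: list[str]) -> dict[str, str]:
    """Build the three input variants: abstract only, highlights only, combined."""
    highlights_text = " ".join(highlights)
    return {
        "abstract": abstract,
        "highlights": highlights_text,
        "combined": f"{abstract} {highlights_text}".strip(),
    }


def f1_at_k(predicted: list[str], gold: list[str], k: int) -> float:
    """Exact-match F1 between the top-k predicted keywords and the gold keywords.

    Assumption: matching is case-insensitive string equality, with no stemming.
    """
    top_k = {p.lower().strip() for p in predicted[:k]}
    gold_set = {g.lower().strip() for g in gold}
    correct = len(top_k & gold_set)
    if correct == 0:
        return 0.0
    precision = correct / len(top_k)
    recall = correct / len(gold_set)
    return 2 * precision * recall / (precision + recall)


# Hypothetical usage: run any unsupervised extractor on each variant,
# then score its ranked output against the author-assigned keywords.
inputs = build_inputs("An abstract.", ["First highlight.", "Second highlight."])
for name, text in inputs.items():
    print(name, "->", text)
```

Running the same extractor and the same metric over all three variants keeps the comparison fair, since only the input text changes.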
In practice
- Combine abstract and highlights for better keyword extraction.
- Apply unsupervised keyword extraction models, which need no labeled training data, to academic papers.
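As a rough illustration of the points above, the sketch below concatenates an abstract with its highlights and runs a deliberately simple frequency-based extractor over the result. The extractor is a stand-in and is not one of the four models evaluated in the study (the summary does not name them); the stopword list, the function name `extract_keywords`, and the sample texts are assumptions made for this example.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {
    "a", "an", "and", "are", "by", "for", "from", "in", "is",
    "of", "on", "the", "to", "we", "with",
}
WORD = re.compile(r"[a-z0-9]+")


def extract_keywords(text: str, top_n: int = 5) -> list[str]:
    """Rank candidate phrases by how often they occur in the text.

    Candidates are maximal runs of non-stopword tokens; stopwords and
    punctuation act as phrase boundaries.
    """
    tokens = re.findall(r"[a-z0-9]+|[^a-z0-9\s]", text.lower())
    phrases, current = [], []
    for tok in tokens:
        if tok in STOPWORDS or WORD.fullmatch(tok) is None:
            if current:
                phrases.append(" ".join(current))
                current = []
        else:
            current.append(tok)
    if current:
        phrases.append(" ".join(current))
    # most_common sorts by count; ties keep first-seen order.
    return [phrase for phrase, _ in Counter(phrases).most_common(top_n)]


# Hypothetical example data, invented for illustration only.
abstract = "Keyword extraction is central to indexing of academic papers."
highlights = [
    "Highlights are combined with the abstract for keyword extraction.",
    "The combined input is evaluated on academic papers.",
]
combined = abstract + " " + " ".join(highlights)

print(extract_keywords(abstract, top_n=2))
print(extract_keywords(combined, top_n=2))
```

With the abstract alone every candidate occurs once, so the ranking is arbitrary; in the combined text the repeated phrases rise to the top. Any stronger unsupervised extractor can be swapped in for `extract_keywords` without changing how the input is assembled.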
Topics
- Keyword Extraction
- Academic Papers
- Highlights Section
- Abstract Integration
- Unsupervised Models
Best for: Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Computation and Language.