Is It Novel and Why? Fine-Grained Patent Novelty Prediction Based on Passage Retrieval

· Source: Computation and Language · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Intellectual Property & Patents · Depth: Advanced, quick

Summary

The paper introduces FiNE-Patents (Fine-grained Novelty Examination of Patents), a dataset of 3,658 first patent claims, each annotated with feature-level prior art references derived from European Search Opinion (ESOP) documents. It reframes patent novelty assessment from claim-level binary classification to a joint retrieval and abstract reasoning task at the feature level: models must identify the specific prior art passages that disclose individual claim features and pinpoint which features make a claim novel. The authors implement and evaluate LLM-based workflows, which outperform embedding-based baselines in both passage retrieval and novel feature identification. These workflows are also robust to the spurious correlations often found in claim-level novelty classification tasks.

Key takeaway

Research scientists developing patent examination tools should consider adopting a feature-level analysis paradigm. Supported by the FiNE-Patents dataset and LLM-based workflows, this approach gives more granular insight into novelty and is less susceptible to spurious correlations than claim-level classification, which supports more accurate and transparent assessments.

Key insights

Fine-grained, feature-level analysis improves patent novelty prediction and mitigates spurious correlations.

Principles

Method

Decompose patent claims into features, analyze each feature against prior art using LLMs, then derive a claim-level novelty prediction.
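The three steps above can be sketched in a few lines of Python. This is only an illustrative toy, not the paper's workflow: the semicolon-based `split_features` heuristic, the token-overlap `overlap` scorer (a stand-in for the LLM judgment), the 0.6 threshold, and the example claim and passages are all assumptions introduced here. It also assumes all passages come from a single prior art document and treats a claim as novel when at least one feature is not disclosed.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class FeatureResult:
    feature: str
    best_passage: Optional[str]  # passage judged to disclose the feature, if any
    disclosed: bool


def split_features(claim: str) -> List[str]:
    # Hypothetical heuristic: treat semicolon-separated clauses as features.
    return [part.strip() for part in claim.split(";") if part.strip()]


def _tokens(text: str) -> set:
    return set(re.findall(r"[a-z]+", text.lower()))


def overlap(feature: str, passage: str) -> float:
    # Stand-in for an LLM judgment: share of feature tokens found in the passage.
    feature_tokens = _tokens(feature)
    if not feature_tokens:
        return 0.0
    return len(feature_tokens & _tokens(passage)) / len(feature_tokens)


def assess_claim(
    claim: str, passages: List[str], threshold: float = 0.6
) -> Tuple[List[FeatureResult], List[str], bool]:
    results = []
    for feature in split_features(claim):
        # Retrieval step: pick the passage that best matches this feature.
        best = max(passages, key=lambda p: overlap(feature, p), default=None)
        disclosed = best is not None and overlap(feature, best) >= threshold
        results.append(FeatureResult(feature, best if disclosed else None, disclosed))
    # Claim-level step: undisclosed features are the candidates for novelty.
    novel_features = [r.feature for r in results if not r.disclosed]
    return results, novel_features, bool(novel_features)


claim = "A bottle comprising a threaded cap; a handle made of graphene"
passages = [
    "The bottle has a threaded cap that seals the opening.",
    "A handle is attached to the container body.",
]
results, novel_features, is_novel = assess_claim(claim, passages)
print(novel_features, is_novel)
```

In this toy run the first feature is matched to the first passage, while the graphene handle finds no sufficiently similar passage and is reported as the feature contributing to novelty. The per-feature results make the claim-level decision traceable, which is the transparency benefit the summary describes.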

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, Legal Professional

Editorial summary, takeaway, and curation by AIssential. Original article published by Computation and Language.