When Correct Edges Cannot Be Verified: A Provenance Gap in Incomplete KGQA and a Provenance-Favoring Completion Policy

Source: Computation and Language · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Data Science & Analytics) · Depth: Expert, quick

Summary

A study of Incomplete Knowledge Graph Question Answering (IKGQA) identifies a "provenance gap": a completed edge can be correct yet have no text that verifies it. Across CWQ and WebQSP, deletion rates of 20% and 40%, and a range of relation types, 76-96% of gold-correct completed edges lack a supporting passage even under exhaustive retrieval. Textual faithfulness therefore measures provenance, not correctness. At the same time, 95-97% of correct answers do not rely on unsupported edges, which reframes edge completion as a decision to "admit or abstain under provenance uncertainty." The paper introduces TGComplete, a provenance-favoring admission policy. As reported, TGComplete reaches 15-21% edge precision against gold, compared with 3-14% for the GoG baseline, with 3.1-7.4 times higher strict faithfulness of admitted edges and no statistically detectable loss in exact match (EM), which positions it for applications that prioritize auditability.

Key takeaway

For Machine Learning Engineers building IKGQA systems: textual support for a completed edge is evidence of provenance, not a guarantee of correctness, and most gold-correct edges will lack it. Where auditability is critical, prefer an explicit provenance-favoring admission policy such as TGComplete. Expect a trade-off: higher precision and faithfulness among admitted edges may come at the cost of lower recall, so balance the two for your specific application.

Key insights

Textual verifiability in IKGQA measures provenance, not correctness: most gold-correct completed edges have no supporting passage, which is the "provenance gap."
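
To make the distinction concrete, here is a minimal sketch of how the two quantities could be computed separately. The function names, the triple representation of an edge, and the toy data are assumptions for illustration, not the paper's evaluation code.

```python
from typing import Iterable, Set, Tuple

# Hypothetical edge representation: (head, relation, tail).
Edge = Tuple[str, str, str]


def provenance_gap(gold_correct: Iterable[Edge], supported: Set[Edge]) -> float:
    """Fraction of gold-correct completed edges that have no supporting passage."""
    edges = list(gold_correct)
    if not edges:
        raise ValueError("no gold-correct edges to evaluate")
    unsupported = sum(1 for e in edges if e not in supported)
    return unsupported / len(edges)


def edge_precision(admitted: Iterable[Edge], gold: Set[Edge]) -> float:
    """Fraction of admitted edges that appear in the gold graph."""
    edges = list(admitted)
    if not edges:
        return 0.0
    return sum(1 for e in edges if e in gold) / len(edges)


# Toy data (invented): four correct edges, only one with textual support.
gold_edges = {
    ("a", "r1", "b"),
    ("b", "r2", "c"),
    ("c", "r3", "d"),
    ("d", "r4", "e"),
}
supported_edges = {("a", "r1", "b")}

gap = provenance_gap(gold_edges, supported_edges)
precision = edge_precision([("a", "r1", "b"), ("x", "r9", "y")], gold_edges)
```

In this toy case every edge in `gold_edges` is correct, yet three of the four have no support, so correctness and verifiability come apart exactly as the insight describes.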

Method

TGComplete retrieves evidence at a reasoning breakpoint, verifies candidates via a lightweight loop, and abstains when textual support is absent.
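
The admit-or-abstain behavior described above can be sketched roughly as follows. This is an illustrative reading of the one-sentence description, not TGComplete itself: `complete_at_breakpoint`, `toy_retrieve`, `toy_supports`, the `max_checks` budget, and the string-matching verifier are all hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# Hypothetical edge representation: (head, relation, tail).
Edge = Tuple[str, str, str]


@dataclass
class Decision:
    edge: Optional[Edge]      # admitted edge, or None when the policy abstains
    evidence: Optional[str]   # passage that supports the admitted edge


def complete_at_breakpoint(
    candidates: List[Edge],
    retrieve: Callable[[Edge], List[str]],
    supports: Callable[[str, Edge], bool],
    max_checks: int = 5,
) -> Decision:
    """Admit the first candidate backed by a retrieved passage, else abstain."""
    for edge in candidates:
        # Lightweight verification loop: inspect a bounded number of passages.
        for passage in retrieve(edge)[:max_checks]:
            if supports(passage, edge):
                return Decision(edge, passage)
    # No candidate had textual support, so abstain instead of guessing.
    return Decision(None, None)


# Toy corpus and checks (assumptions): substring matching stands in for
# whatever retriever and verifier a real system would use.
CORPUS = ["Marie Curie was born in Warsaw."]


def toy_retrieve(edge: Edge) -> List[str]:
    return [p for p in CORPUS if edge[0] in p]


def toy_supports(passage: str, edge: Edge) -> bool:
    return edge[0] in passage and edge[2] in passage


admitted = complete_at_breakpoint(
    [("Marie Curie", "born_in", "Paris"), ("Marie Curie", "born_in", "Warsaw")],
    toy_retrieve,
    toy_supports,
)
abstained = complete_at_breakpoint(
    [("Ada Lovelace", "born_in", "London")], toy_retrieve, toy_supports
)
```

Returning the supporting passage alongside the admitted edge is a design choice made here for auditability: each admitted edge can be traced back to the text that justified it.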

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Computation and Language.