BLINKG: A Benchmark for LLM-Integrated Knowledge Graph Generation

Source: cs.AI updates on arXiv.org · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Data Science & Analytics) · Depth: Expert, extended

Summary

BLINKG is a benchmark for evaluating Large Language Models (LLMs) on Knowledge Graph (KG) generation, focusing on their ability to map heterogeneous data sources to ontology terms. It addresses the lack of a standardized framework for assessing how effective LLMs are at KG construction, a task that can traditionally require six person-months of manual effort. BLINKG comprises three progressively complex scenarios (Basic, Schema-aligned, Schema-distant) based on real-world use cases, and it supports CSV, JSON, and XML inputs. An evaluation of six state-of-the-art LLMs (DeepSeek-R1, Gemini 2.5 Pro, GPT-4o, OpenAI o3, LLaMa 3.3 70B Instruct, Mixtral 8x22B Instruct) shows promising results in the simple scenarios but limited performance in the complex ones, particularly on join conditions and transformation functions. The benchmark also defines requirements for (semi)automated LLM-driven KG construction.

Key takeaway

If you are a knowledge engineer aiming to automate Knowledge Graph construction, current LLMs offer promising solutions for basic data-to-ontology mapping tasks. For complex scenarios involving schema-distant data or intricate join conditions, however, expect significant limitations. You will likely need to place LLMs inside a human-in-the-loop workflow, combining their output with symbolic reasoning or expert validation to ensure robust and semantically sound KGs.

Key insights

BLINKG benchmarks LLM capabilities in mapping diverse data to ontologies for Knowledge Graph generation.
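
To make the mapping task concrete, here is a minimal Python sketch of what a data-to-ontology mapping does once it exists. It is an illustration only, not BLINKG's format or tooling: the CSV content, the `ex:` prefix, the `ex:birthYear` property, and the `rows_to_triples` helper are all hypothetical. The `MAPPING` dictionary stands in for the kind of column-to-property assignment an LLM would be asked to propose.

```python
import csv
import io

# Hypothetical input data; the benchmark's real inputs are not reproduced here.
CSV_TEXT = "id,full_name,birth_year\n1,Ada Lovelace,1815\n2,Alan Turing,1912\n"

# Hypothetical column-to-property mapping (the artifact an LLM would propose).
MAPPING = {"full_name": "foaf:name", "birth_year": "ex:birthYear"}


def rows_to_triples(csv_text, mapping, subject_column="id", base="ex:person/"):
    """Apply a column-to-property mapping to CSV rows, yielding (s, p, o) tuples."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"{base}{row[subject_column]}"
        for column, prop in mapping.items():
            value = row.get(column)
            if value:  # skip missing or empty cells
                triples.append((subject, prop, value))
    return triples


if __name__ == "__main__":
    for triple in rows_to_triples(CSV_TEXT, MAPPING):
        print(triple)
```

This toy case maps one column to one property. The harder cases named in the summary, join conditions across sources and transformation functions on values, would need a richer mapping structure than a flat dictionary.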

Method

BLINKG evaluates LLMs across three scenarios (Basic, Schema-aligned, Schema-distant) using Precision, Recall, and F-score, supplemented by Levenshtein and cosine similarity checks against gold standards.
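
The scoring idea can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the benchmark's actual implementation. It assumes predictions and gold entries are plain strings, that a prediction counts as correct when its normalized Levenshtein similarity to an unused gold entry reaches a threshold (the value 0.8 is arbitrary), and that matching is greedy. The function names are hypothetical, and the cosine similarity check is left out because the summary does not say which vector representation it uses.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance via the standard two-row dynamic program."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Levenshtein distance normalized to [0, 1]; 1.0 means identical."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def score(predicted, gold, threshold=0.8):
    """Precision, recall, and F1 with fuzzy, greedy one-to-one matching."""
    unmatched = list(gold)
    true_positives = 0
    for p in predicted:
        best = max(unmatched, key=lambda g: similarity(p, g), default=None)
        if best is not None and similarity(p, best) >= threshold:
            true_positives += 1
            unmatched.remove(best)  # each gold entry can be matched once
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    gold = ["name -> foaf:name", "age -> foaf:age", "city -> ex:city"]
    predicted = ["name -> foaf:name", "age -> foaf:age", "city -> ex:town"]
    print(score(predicted, gold))
```

Fuzzy matching tolerates small surface differences, such as casing or a stray character, that exact string comparison would count as errors. The threshold controls how lenient the evaluation is.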

Best for: AI Scientist, Research Scientist, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.