Knowledge Graph and Hypergraph Transformers with Repository-Attention and Journey-Based Role Transport

· Source: cs.LG updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics · Depth: Expert, medium

Summary

The paper proposes an architecture for jointly training language models on sentences and structured data (knowledge graphs and hypergraphs) while keeping knowledge and language representations separate. Structured instances are encoded into a key-value repository that the language transformer attends over. Attention is conditioned by "journey-based role transport," which unifies edge-labeled KG traversal, hyperedge traversal, and sentence structure. The design is dual-stream, with hierarchical layer groups for instance-local, neighborhood, and global mixing attention, plus retrieval over the separate repository. Training is multi-task, with objectives including masked language modeling, link prediction, and role-consistency denoising. Because the language stream reaches structured knowledge through cross-attention, the separation between linguistic context and knowledge is explicit and inspectable.

Key takeaway

For research scientists developing NLP models, this architecture offers a way to integrate knowledge graphs and hypergraphs with language models while keeping knowledge explicitly separate from language, which makes the knowledge easier to inspect and update. Consider journey-based role transport as a way to unify diverse data structures and to support reasoning over combined text and structured data.

Key insights

A novel architecture separates structured knowledge from language in transformers using a key-value repository and journey-based role transport.

Principles

Method

Structured instances are encoded into a key-value repository. A language transformer attends to this repository using journey-based role transport, which generalizes positional embeddings to arbitrary roles and instances, enabling joint training via cross-attention.
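
The repository idea can be illustrated with a minimal sketch in plain Python. It is not the paper's implementation: the slot encoding (key = head + relation embedding, value = tail embedding), the helper names `build_repository` and `cross_attend`, and the toy triples are all assumptions made for illustration, and journey-based role transport is omitted entirely. The sketch only shows how a token state from the language stream could attend over separately stored key-value slots, with the attention weights exposing which stored fact was used.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def build_repository(triples, embed):
    """Encode each (head, relation, tail) triple as one key-value slot.

    Assumed encoding, not taken from the paper:
    key = head embedding + relation embedding, value = tail embedding.
    """
    keys, values = [], []
    for head, rel, tail in triples:
        keys.append([h + r for h, r in zip(embed[head], embed[rel])])
        values.append(list(embed[tail]))
    return keys, values


def cross_attend(query, keys, values):
    """Scaled dot-product cross-attention of one query over the repository.

    Returns the attended context vector and the per-slot weights,
    which can be inspected to see which slot contributed.
    """
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, k) / scale for k in keys])
    dim = len(values[0])
    context = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return context, weights


# Toy data: random embeddings for a handful of hypothetical symbols.
rng = random.Random(0)
DIM = 8
names = ["Paris", "France", "Berlin", "Germany", "capital_of"]
embed = {n: [rng.gauss(0, 1) for _ in range(DIM)] for n in names}
triples = [("Paris", "capital_of", "France"), ("Berlin", "capital_of", "Germany")]

keys, values = build_repository(triples, embed)
query = keys[0]  # stand-in for a language-stream token state
context, weights = cross_attend(query, keys, values)
print("slot weights:", weights)
```

Because the repository lives outside the language model's parameters in this sketch, a fact can be changed by rewriting a slot rather than retraining, which is the kind of updateability the summary describes.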

Best for: Research Scientist, AI Researcher, AI Scientist, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.LG updates on arXiv.org.