Humans' ALMANAC: A Human Collaboration Dataset of Action-Level Mental Model Annotations for Agent Collaboration

· Source: Artificial Intelligence · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Data Science & Analytics · Depth: Expert, quick

Summary

The ALMANAC dataset targets a gap in LLM agent collaboration: current agents are optimized for task completion and lack the ability to continuously align their mental models with human partners. Derived from the classic Map Task, a dyadic routing exercise, ALMANAC comprises 2,987 collaboration actions. Each action is annotated with theory-informed mental models covering the participant's self-reasoning, perceived partner intent, and perceived team goal. The dataset aims to guide agents toward process-level collaborative competence rather than task completion alone. The researchers benchmarked six LLMs on ALMANAC to show that it can evaluate how well models simulate human collaborative behaviors and infer the mental models underlying them.

Key takeaway

For machine learning engineers building collaborative LLM agents, ALMANAC is a resource for moving beyond task-centric optimization. Use it to train and evaluate models on process-level collaborative competence: aligning mental models, understanding partner intent, and tracking shared goals. The aim is agents that simulate human collaborative behaviors more accurately and infer the mental models behind them, which supports more effective human-agent partnerships in complex tasks.

Key insights

ALMANAC provides action-level mental model annotations to train LLM agents for human-like collaboration.

Method

ALMANAC was built from the Map Task, a dyadic routing exercise, and comprises 2,987 collaboration actions. Each action received theory-informed annotations for self-reasoning, perceived partner intent, and perceived team goal.
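
To make the annotation structure concrete, here is a minimal Python sketch of how one annotated action might be represented and turned into a mental-model inference prompt. It is an illustration only: the class name, field names, role labels, prompt wording, and example rows are all hypothetical and are not taken from the released dataset or the paper's benchmark setup.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AnnotatedAction:
    """One collaboration action plus its three mental-model annotations.

    All field names are hypothetical; the released dataset may use others.
    """
    dialogue_id: str
    turn: int
    role: str                      # assumed Map Task roles: "giver" / "follower"
    utterance: str
    self_reasoning: str
    perceived_partner_intent: str
    perceived_team_goal: str


def build_inference_prompt(history: List[AnnotatedAction],
                           target: AnnotatedAction) -> str:
    """Format the dialogue so far and ask for the target speaker's mental model.

    Only utterances are shown; the annotations stay hidden as gold labels.
    """
    lines = [f"{a.role}: {a.utterance}" for a in history]
    lines.append(f"{target.role}: {target.utterance}")
    lines.append(
        f"For the last action by the {target.role}, state their "
        "self-reasoning, perceived partner intent, and perceived team goal."
    )
    return "\n".join(lines)


# Invented example rows, for illustration only (not from the dataset).
first = AnnotatedAction(
    "d001", 1, "giver", "Head south from the start, past the old mill.",
    self_reasoning="The mill is the nearest landmark on my map.",
    perceived_partner_intent="Waiting for a first direction.",
    perceived_team_goal="Reproduce my route on the partner's map.",
)
second = AnnotatedAction(
    "d001", 2, "follower", "I don't have a mill. Is it near the lake?",
    self_reasoning="My map lacks that landmark, so I need another anchor.",
    perceived_partner_intent="Assumes our maps show the same landmarks.",
    perceived_team_goal="Agree on shared reference points first.",
)

print(build_inference_prompt([first], second))
```

A model's answer to such a prompt could then be compared against the three hidden annotation fields of the target action; how that comparison is scored is left open here.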

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, NLP Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.