Sharing is caring: data sharing in multi-agent supply chains

2026-03-02 · Source: cs.MA updates on arXiv.org · Field: Technology & Digital — Artificial Intelligence & Machine Learning, AI for Supply Chain Optimization · Depth: Advanced, extended

Summary

This research explores data-sharing strategies in multi-agent supply chain networks, focusing on a two-echelon system with a factory agent and a retailer agent. It investigates how four communication approaches (sharing no information, lying, telling the truth, or a mixed strategy) affect system performance under high- and low-demand scenarios, with both baseline and collaborative reward shaping. The study uses a multi-agent system built on the Gymnasium API and Ray's multi-agent tools, with agents trained using PyTorch on the IRIDIS supercomputer. Key findings indicate that data sharing significantly boosts performance, especially when combined with cooperative reward shaping. Under high demand, lying by the factory yields a small overall system improvement (0.9%), primarily benefiting the factory (1711 reward). Under low demand, telling the truth is most successful, increasing factory performance by 158% and retailer performance by 7.5%, so that all actors benefit substantially.

Key takeaway

Research scientists optimizing multi-agent supply chain models should recognize that a "one-size-fits-all" data-sharing strategy is ineffective. In low-demand environments, prioritize truthful information exchange, which significantly benefits both factory and retailer agents. In high-demand, factory-dominated scenarios, consider allowing the factory agent to lie about inventory: this can provide a small but measurable overall system improvement, even though it does not benefit all agents equally. Always align data sharing with the demand context and the reward structure.

Key insights

Optimal data sharing in multi-agent supply chains varies significantly with demand conditions and reward structures.

Principles

Data sharing boosts supply chain performance.
Collaborative rewards amplify data sharing effects.
Demand dynamics dictate optimal communication strategy.

Method

A two-echelon multi-agent supply chain environment was simulated using Gymnasium and Ray. The factory agent could share truthful inventory data, false inventory data, or no inventory data with the retailer, under varying demand and reward conditions.
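
The factory's communication choice described above can be sketched in a few lines of plain Python. This is an illustrative sketch only, not the paper's implementation: the function name `factory_message`, the strategy labels, and the `lie_offset` parameter are all hypothetical, and the summary does not say how a false report is generated. Here a lie is assumed to overstate inventory by a fixed amount, and the mixed strategy is assumed to pick truth or lie at random on each step.

```python
import random
from typing import Optional

# Hypothetical labels for the four communication approaches in the summary.
STRATEGIES = ("none", "truth", "lie", "mixed")


def factory_message(
    true_inventory: int,
    strategy: str,
    rng: random.Random,
    lie_offset: int = 10,  # assumption: a lie overstates stock by this amount
) -> Optional[int]:
    """Return the inventory figure the retailer observes, or None if nothing is shared."""
    if strategy not in STRATEGIES:
        raise ValueError(f"unknown strategy: {strategy!r}")
    if strategy == "none":
        return None
    if strategy == "mixed":
        # Assumption: the mixed strategy chooses truth or lie uniformly at random.
        strategy = rng.choice(("truth", "lie"))
    if strategy == "truth":
        return true_inventory
    # strategy == "lie": report a distorted figure (assumed form of distortion).
    return true_inventory + lie_offset


rng = random.Random(0)
for s in STRATEGIES:
    print(s, factory_message(50, s, rng))
```

In a real environment the returned value would be one component of the retailer's observation; keeping the message logic in a separate function makes it easy to swap strategies between training runs.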

In practice

Implement truth-telling in low-demand supply chains.
Consider strategic lying in high-demand factory-dominated scenarios.
Integrate collaborative reward shaping for amplified benefits.
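
One common way to express collaborative reward shaping is to blend each agent's own reward with its partner's. The sketch below is a minimal illustration under that assumption; the summary does not give the paper's actual shaping formula, and the function name `shaped_rewards` and the weight `alpha` are hypothetical.

```python
from typing import Dict


def shaped_rewards(
    factory_reward: float,
    retailer_reward: float,
    alpha: float = 0.5,  # hypothetical weight placed on the partner's reward
) -> Dict[str, float]:
    """Blend each agent's reward with its partner's.

    alpha = 0 leaves the rewards untouched (the baseline case);
    alpha = 0.5 gives both agents the same shared signal.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return {
        "factory": (1 - alpha) * factory_reward + alpha * retailer_reward,
        "retailer": (1 - alpha) * retailer_reward + alpha * factory_reward,
    }


print(shaped_rewards(10.0, 20.0, alpha=0.0))
print(shaped_rewards(10.0, 20.0, alpha=0.5))
```

With this particular blend the total reward across both agents is unchanged for any `alpha`, so the shaping redistributes the incentive rather than inflating it. Other shaping schemes are equally plausible.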

Topics

Multi-Agent Reinforcement Learning
Supply Chain Management
Data Sharing Strategies
Reward Shaping

Code references

wangwan0910/masc

Best for: Research Scientist, AI Researcher, AI Scientist, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by cs.MA updates on arXiv.org.