Synthetic Benchmarks Overstate Forward-Forward Scaling: Real-Data Limits of Layer-Local Training

2026-06-04 · Source: Computer Vision and Pattern Recognition · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Computer Vision & Pattern Recognition, Neural and Evolutionary Computing · Depth: Expert, quick

Summary

DTG-FF, a new Forward-Forward (FF) learning instrument, sets a new best for FF-family models across nine real-data benchmarks, reaching 91.8% on CIFAR-10 and providing the first FF baseline on ImageNet-100 at 224x224. A rigorous audit using DTG-FF nonetheless reveals significant scaling limits for layer-local training on real-world data. An architecture-matched BP-DeepSup baseline beats DTG-FF by 2.40 percentage points on CIFAR-10 and 5.93 on CIFAR-100, so the gap widens with class count. At 224x224 resolution, DTG-FF reaches only 49.4%, well below typical backpropagation (BP) performance of over 75%, which exposes a real-data ceiling. The study also identifies a "K-conflict," in which synthetic benchmarks overstate FF's transferability. In a systems audit on 8 GB hardware, DTG-FF consumes 7.90 GB and processes 138 images/second, while BP with gradient accumulation uses 4.18 GB and processes 157 images/second, which challenges FF's memory-efficiency claims at this scale.
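
To put the audit figures side by side, the short calculation below derives the ratios implied by the numbers quoted in the summary. It uses only those reported values; the variable names are illustrative, and the 75% BP figure is treated as a lower bound because the summary says "exceeding 75%".

```python
# Figures quoted in the summary (systems audit on 8 GB hardware).
ff_mem_gb, ff_ips = 7.90, 138   # DTG-FF: memory used, images/second
bp_mem_gb, bp_ips = 4.18, 157   # BP + gradient accumulation

mem_ratio = ff_mem_gb / bp_mem_gb        # DTG-FF memory relative to BP
throughput_ratio = ff_ips / bp_ips       # DTG-FF throughput relative to BP

# Accuracy at 224x224: DTG-FF vs. the ">75%" typical BP figure (lower bound).
min_gap_224 = 75.0 - 49.4

print(f"memory ratio: {mem_ratio:.2f}x")              # 1.89x
print(f"throughput ratio: {throughput_ratio:.2f}x")   # 0.88x
print(f"224x224 gap: at least {min_gap_224:.1f} points")  # 25.6
```

On these figures DTG-FF uses roughly 1.9 times the memory of the BP baseline while running at about 88% of its throughput.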

Key takeaway

Machine learning engineers evaluating layer-local training methods such as Forward-Forward for large-scale computer vision tasks should recognize the methods' current limitations. This research indicates that FF models, despite recent improvements, do not scale effectively on real-world data beyond 32x32 resolution and offer no memory advantage over standard backpropagation with gradient accumulation on commodity 8 GB hardware. Prefer backpropagation in production systems where robust accuracy and efficient resource use matter.

Key insights

Synthetic benchmarks significantly overstate Forward-Forward learning's real-data scaling and memory efficiency; real-data audits reveal clear limitations relative to backpropagation.

Principles

Synthetic benchmarks can overstate model transferability.
Real-data scaling limits are invisible at small resolutions.
Memory efficiency claims require fair, real-world baselines.

Method

DTG-FF integrates dynamic temperature goodness, decoupled normalization, and multi-layer fusion to enhance Forward-Forward learning, enabling rigorous auditing of its real-data scaling limits.
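
For readers unfamiliar with the "goodness" objective that FF methods build on, here is a minimal sketch of the standard layer-local FF loss: sum-of-squares goodness compared against a threshold for positive and negative samples. It is not DTG-FF's implementation. The summary does not specify how the dynamic temperature, decoupled normalization, or multi-layer fusion work, so the fixed `temperature` parameter below is a hypothetical stand-in, and `theta` and all function names are assumptions for illustration.

```python
import math

def goodness(activations):
    """Sum of squared activations for one sample at one layer."""
    return sum(a * a for a in activations)

def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def ff_layer_loss(pos_acts, neg_acts, theta=1.0, temperature=1.0):
    """Layer-local FF loss: push positive goodness above theta and
    negative goodness below it. `temperature` is a hypothetical fixed
    scale, not the dynamic schedule used by DTG-FF."""
    g_pos = goodness(pos_acts)
    g_neg = goodness(neg_acts)
    return (softplus(-(g_pos - theta) / temperature)
            + softplus((g_neg - theta) / temperature))

# A layer that separates positive from negative data gets a lower loss.
separated = ff_layer_loss(pos_acts=[2.0, 2.0], neg_acts=[0.1, 0.1])
confused = ff_layer_loss(pos_acts=[0.1, 0.1], neg_acts=[2.0, 2.0])
print(separated < confused)  # True
```

Because the loss depends only on one layer's activations, each layer can be updated without a backward pass through the rest of the network, which is the property the memory-efficiency claims rest on.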

In practice

Use 224x224 ImageNet-100 for FF scaling audits.
Avoid relying on synthetic K-sweeps to judge real-world transferability.
Benchmark FF memory against BP+gradient-accumulation.
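
Gradient accumulation is a fair BP baseline for memory comparisons because it produces the same update as a full batch while holding only one micro-batch of activations at a time. The sketch below checks that equivalence on a toy 1-D linear model in plain Python; the model, data, and function names are illustrative assumptions and are not taken from the paper's setup.

```python
def mse_grad(w, xs, ys):
    """Gradient of mean squared error for the 1-D model y_hat = w * x."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n

def accumulated_grad(w, xs, ys, micro_batch_size):
    """Sum micro-batch gradients, each weighted by its share of the batch."""
    n = len(xs)
    total = 0.0
    for start in range(0, n, micro_batch_size):
        mx = xs[start:start + micro_batch_size]
        my = ys[start:start + micro_batch_size]
        total += mse_grad(w, mx, my) * len(mx) / n
    return total

# Toy data (illustrative only).
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
ys = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0, 7.2, 7.9]

full = mse_grad(0.3, xs, ys)
accum = accumulated_grad(0.3, xs, ys, micro_batch_size=2)
print(abs(full - accum) < 1e-9)  # True: same gradient, smaller working set
```

In a real audit the same idea applies at scale: fix the effective batch size for both methods, then compare peak memory and throughput under that constraint.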

Topics

Forward-Forward Learning
Layer-Local Training
Real-Data Benchmarks
Backpropagation
Model Scaling
Deep Learning Systems

Best for: Research Scientist, Computer Vision Engineer, AI Scientist, Machine Learning Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Computer Vision and Pattern Recognition.