Learned Non-Maximum Suppression for 3D Object Detection

2026-06-02 · Source: Artificial Intelligence · Field: Technology & Digital — Artificial Intelligence & Machine Learning, Robotics & Autonomous Systems · Depth: Expert, quick

Summary

This work introduces two learned filtering modules, D2D-Rescore and GossipNet3D, that replace heuristic non-maximum suppression (NMS) in LiDAR-based 3D object detection. D2D-Rescore uses transformer-based detection-to-detection (D2D) attention, while GossipNet3D adapts the 2D GossipNet concept to 3D through localized message passing in bird's-eye view. A metric-aware matching strategy, aligned with the nuScenes evaluation protocol, keeps the training targets consistent with how detections are scored at validation. Compared with CircleNMS, both modules significantly improve mean average precision (mAP), nuScenes detection score (NDS), and true-positive quality, with the largest benefit for small and infrequent object classes. The computational overhead is minimal, which shows that learned, detection-level filtering can improve 3D detector reliability without altering the base network.

Key takeaway

Machine learning engineers optimizing LiDAR-based 3D object detection should consider replacing heuristic suppression with a learned NMS module such as D2D-Rescore or GossipNet3D. Both significantly raise mAP and NDS, particularly for small or infrequent objects. Because they operate on the detector's output, the base network needs no modification, which makes them a direct path to performance gains in critical perception systems.

Key insights

D2D-Rescore and GossipNet3D improve 3D object detection by replacing heuristic NMS with learned filtering that exploits relations between detections.

Principles

Learned filtering improves 3D detection reliability.
Relational reasoning among detections is effective.
Metric-aware matching aligns training with evaluation.
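
To make the matching principle concrete, here is a minimal sketch of greedy center-distance matching in the style of the nuScenes detection metric, which matches predictions to ground truth by 2D center distance on the ground plane (thresholds of 0.5, 1, 2, and 4 m), visiting predictions in descending confidence. Using such matches as binary training targets is an assumption about how a metric-aware strategy could look, not the authors' confirmed procedure; the function name assign_targets and the dictionary layout are hypothetical.

```python
import math

def assign_targets(dets, gts, max_dist=2.0):
    """Label each detection 1 (true positive) or 0 under center-distance matching.

    dets: list of dicts with 'xy' (BEV center in metres), 'score', 'label'
    gts:  list of dicts with 'xy', 'label'
    max_dist: match threshold in metres (hypothetical default of 2.0)
    """
    targets = [0] * len(dets)
    taken = set()  # ground-truth indices that are already matched
    # Visit detections from most to least confident, as the metric does.
    order = sorted(range(len(dets)), key=lambda i: dets[i]["score"], reverse=True)
    for i in order:
        best_j, best_d = None, max_dist
        for j, gt in enumerate(gts):
            if j in taken or gt["label"] != dets[i]["label"]:
                continue  # each ground truth matches once, within its class
            d = math.dist(dets[i]["xy"], gt["xy"])
            if d < best_d:
                best_j, best_d = j, d
        if best_j is not None:
            taken.add(best_j)
            targets[i] = 1
    return targets
```

Under this scheme a duplicate detection of an already matched object receives target 0, which is exactly the behavior a learned filter has to reproduce to score well under the same metric.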

Method

D2D-Rescore uses transformer-based D2D attention; GossipNet3D adapts 2D GossipNet to 3D via localized message passing in bird's-eye view.
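
The idea of detection-to-detection attention can be illustrated with a single-head self-attention pass in which every detection attends to every other detection before its confidence is recomputed. This is a sketch under strong assumptions, not the D2D-Rescore architecture: the feature layout, the square weight matrices Wq, Wk, Wv, the residual connection, and the linear output head w_out are all hypothetical, and in practice the weights would be learned rather than supplied by hand.

```python
import math

def softmax(xs):
    m = max(xs)  # subtract the maximum for numerical stability
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def matvec(W, x):
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def d2d_attention(feats, Wq, Wk, Wv):
    """Single-head self-attention over a list of per-detection feature vectors."""
    q = [matvec(Wq, f) for f in feats]
    k = [matvec(Wk, f) for f in feats]
    v = [matvec(Wv, f) for f in feats]
    scale = math.sqrt(len(q[0]))
    out = []
    for qi in q:
        logits = [sum(a * b for a, b in zip(qi, kj)) / scale for kj in k]
        w = softmax(logits)  # how strongly this detection attends to each one
        out.append([sum(wj * vj[c] for wj, vj in zip(w, v))
                    for c in range(len(v[0]))])
    return out

def rescore(feats, Wq, Wk, Wv, w_out, b_out=0.0):
    """Return a new confidence in (0, 1) per detection (hypothetical output head)."""
    ctx = d2d_attention(feats, Wq, Wk, Wv)
    scores = []
    for f, c in zip(feats, ctx):
        h = [a + b for a, b in zip(f, c)]  # residual: own features plus context
        z = sum(w * x for w, x in zip(w_out, h)) + b_out
        scores.append(1.0 / (1.0 + math.exp(-z)))
    return scores
```

Because attention without positional encoding is permutation-equivariant, the new scores do not depend on the order in which detections are listed, which suits an unordered set of candidate boxes.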

In practice

Apply learned NMS post-processing.
Improve detection for small, infrequent objects.
Integrate with existing 3D base networks.
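
For orientation, the heuristic stage that a learned module would replace can be sketched as center-distance suppression, the idea commonly referred to as circle NMS: keep the highest-scoring detection and drop any lower-scoring one whose bird's-eye-view center lies within a fixed radius of a kept detection. The tuple layout and the function name circle_nms are assumptions made for illustration; production implementations are typically vectorized and use per-class radii.

```python
import math

def circle_nms(dets, radius):
    """Greedy center-distance suppression in bird's-eye view.

    dets: list of (x, y, score) tuples in metres
    radius: suppression radius in metres
    Returns indices of kept detections, highest score first.
    """
    order = sorted(range(len(dets)), key=lambda i: dets[i][2], reverse=True)
    keep = []
    for i in order:
        xi, yi, _ = dets[i]
        # Keep only if farther than `radius` from every detection kept so far.
        if all(math.hypot(xi - dets[k][0], yi - dets[k][1]) > radius for k in keep):
            keep.append(i)
    return keep
```

The single hand-tuned radius is the kind of fixed rule that learned filtering aims to replace, and since both stages consume the detector's output, swapping one for the other leaves the base network untouched.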

Topics

3D Object Detection
Non-Maximum Suppression
LiDAR Perception
D2D-Rescore
GossipNet3D
nuScenes Benchmark

Code references

rst-tu-dortmund/learned-3d-nms

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, Computer Vision Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Artificial Intelligence.