MNet++: Extended 2D/3D Networks for Anisotropic Medical Image Segmentation

Source: Computer Vision and Pattern Recognition · Field: Technology & Digital (Artificial Intelligence & Machine Learning, Computer Vision & Pattern Recognition) · Depth: Expert, quick

Summary

MNet++ is an extended hybrid 2D/3D convolutional network for anisotropic medical image segmentation that builds on the original MNet architecture. MNet was first re-implemented within the nnU-Net framework, where it achieved a Dice similarity coefficient (DSC) of 89.0 +/- 0.9% on PROMISE prostate MRI, closely matching published results. On LiTS liver CT, it scored 94.3 +/- 1.9% for liver and 54.6 +/- 3.1% for tumor segmentation. Two lightweight extensions were then developed: a learned Fusion Gating mechanism for adaptive 2D-3D feature blending, and a VMamba state-space module for efficient long-range depth modeling. The Spatial Gating variant improved DSC by +0.8% with less than 3% inference overhead. VMamba made performance more consistent, reducing PROMISE Dice variation to +/- 0.7% and reaching 95.8% Dice for LiTS liver. Both extensions preserved MNet's robustness to anisotropy, with Dice changing by 1.5% across 1-4 mm voxel spacing.
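All results above are reported as Dice similarity coefficients. As a reminder of what that metric measures, here is a minimal, dependency-free sketch for two binary masks flattened to sequences of 0/1. The function name `dice_coefficient` and the convention of returning 1.0 for two empty masks are assumptions made for illustration, not details taken from the paper or from nnU-Net.

```python
def dice_coefficient(pred, target):
    """Dice similarity coefficient for two flat binary masks (sequences of 0/1).

    DSC = 2 * |pred AND target| / (|pred| + |target|)
    """
    if len(pred) != len(target):
        raise ValueError("masks must contain the same number of voxels")

    # Voxels labeled foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total foreground voxels across both masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)

    if total == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    return 2.0 * intersection / total


# Toy example: 2 overlapping voxels, 3 foreground voxels in each mask.
pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]
score = dice_coefficient(pred, target)  # 2 * 2 / (3 + 3)
```

A figure such as 89.0 +/- 0.9% would typically be the mean and spread of this score over cases or folds, expressed as a percentage.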

Key takeaway

Machine learning engineers building medical image segmentation models should consider adding adaptive 2D-3D feature fusion or a state-space module such as VMamba to their architectures. In MNet++, these extensions delivered modest Dice gains (+0.8% for the Spatial Gating variant, at under 3% inference overhead) and more consistent results (PROMISE Dice variation reduced to +/- 0.7%), while preserving robustness to anisotropic voxel spacing.

Key insights

Adaptive 2D-3D feature fusion and state-space models enhance MNet's anisotropic medical image segmentation reliability and reproducibility.

Method

MNet is first re-implemented in nnU-Net. It is then extended with a learned Fusion Gating mechanism, which adaptively blends 2D and 3D features, and a VMamba state-space module, which models long-range dependencies along the depth axis.
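The idea of a learned gate that blends 2D and 3D features can be illustrated with a small, dependency-free sketch. This is not the paper's Fusion Gating implementation. The function `gated_fusion`, its scalar parameters `w_2d`, `w_3d`, and `bias`, and the element-wise sigmoid gate are all hypothetical choices for illustration. In a real network the gate would be produced by learned convolutions over feature maps.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(feat_2d, feat_3d, w_2d, w_3d, bias):
    """Blend two equally sized feature vectors with an element-wise gate.

    For each position, g = sigmoid(w_2d * a + w_3d * b + bias) and the
    fused value is g * a + (1 - g) * b, so g near 1 favors the 2D branch
    and g near 0 favors the 3D branch.
    """
    if len(feat_2d) != len(feat_3d):
        raise ValueError("feature vectors must have the same length")

    fused = []
    for a, b in zip(feat_2d, feat_3d):
        g = sigmoid(w_2d * a + w_3d * b + bias)  # hypothetical learned gate
        fused.append(g * a + (1.0 - g) * b)
    return fused


# With all gate parameters at zero the gate is 0.5 everywhere,
# so the result is the plain average of the two branches.
balanced = gated_fusion([1.0, 2.0], [3.0, 4.0], w_2d=0.0, w_3d=0.0, bias=0.0)
```

Because the gate depends on the features themselves, the blend can differ per position, which is what distinguishes this kind of adaptive fusion from a fixed sum or concatenation of the 2D and 3D branches.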

Best for: Research Scientist, AI Scientist, Machine Learning Engineer, Computer Vision Engineer

Editorial summary, takeaway, and curation by AIssential. Original article published by Computer Vision and Pattern Recognition.