Adapting Diffusion Language Models for Lossless Pixel-Level Image Transmission
Summary
The paper introduces DDM-SSCC, a separate source-channel coding framework built on a discrete diffusion model and designed for lossless pixel-level image transmission. Semantic communication schemes often incur reconstruction errors; DDM-SSCC instead targets exact image recovery, which is crucial in fidelity-sensitive applications such as remote medical imaging. It adapts a diffusion language model to pixel-token restoration, employing synchronized reverse arithmetic coding with bidirectional attention. Key innovations include a Halton-guided denoising order for improved spatial coverage, a mask-ratio-aware cosine schedule that adapts the denoising pace, and a lightweight temperature calibration module for the probability tables. Experiments on the CIFAR10, DIV2K-LR-X4, and Kodak datasets under AWGN and Rayleigh fading channels show that DDM-SSCC reaches perfect reconstruction above 2 dB unified SNR and outperforms representative lossless and semantic communication baselines in exact recovery.
Key takeaway
For machine learning engineers building robust image transmission systems, DDM-SSCC shows that pixel-level lossless recovery is achievable over noisy channels, which is vital in fields like remote medical imaging. The recipe is to pair a discrete diffusion model with synchronized arithmetic coding, then add Halton-guided denoising and a mask-ratio-aware schedule to improve reliability and compression. In the reported experiments, this approach outperforms representative lossless and semantic communication baselines at exact image reconstruction.
Key insights
DDM-SSCC enables lossless pixel-level image transmission by adapting discrete diffusion models for synchronized arithmetic coding.
Principles
- Bidirectional attention improves pixel-level context.
- Low-discrepancy denoising enhances spatial coverage.
- Adaptive schedules optimize denoising pace.
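To make the low-discrepancy principle concrete, here is a minimal sketch of a 2-D Halton visiting order over a pixel grid. It is an illustration under stated assumptions, not the paper's implementation: the function names `radical_inverse` and `halton_order`, the choice of bases 2 and 3, and the skip-duplicates rule are all hypothetical choices made for this example.

```python
def radical_inverse(index: int, base: int) -> float:
    """Van der Corput radical inverse of `index` in the given base."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def halton_order(height: int, width: int) -> list[tuple[int, int]]:
    """Visit every (row, col) cell exactly once in 2-D Halton order.

    Assumption for this sketch: bases 2 and 3, and cells that the
    sequence hits a second time are simply skipped.
    """
    order: list[tuple[int, int]] = []
    seen: set[tuple[int, int]] = set()
    index = 1
    while len(order) < height * width:
        # Map the Halton point in [0, 1)^2 to a grid cell.
        row = min(int(radical_inverse(index, 2) * height), height - 1)
        col = min(int(radical_inverse(index, 3) * width), width - 1)
        if (row, col) not in seen:
            seen.add((row, col))
            order.append((row, col))
        index += 1
    return order


order = halton_order(4, 4)
print(order[:4])  # early cells are spread across the grid, not clustered
```

Because the order depends only on the grid size, a sender and a receiver can each compute it independently and stay in step without exchanging extra data.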
Method
DDM-SSCC uses a synchronized discrete-diffusion source coding protocol. Starting from a fully masked sequence, it progressively restores tokens using three components: a Halton-guided denoising order, a cosine schedule that sets the pace of token processing, and mask-ratio-aware temperature calibration of the probability tables.
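The pacing idea can be sketched as follows. This is a generic cosine masking schedule of the kind used in masked-token generation, where the masked fraction after step t of T follows cos(pi/2 * t/T); the summary does not give the paper's mask-ratio-aware variant, so the function name `cosine_unmask_counts` and the rounding rule are assumptions for illustration only.

```python
import math


def cosine_unmask_counts(num_tokens: int, num_steps: int) -> list[int]:
    """Number of tokens to restore at each step of a cosine schedule.

    Assumption for this sketch: after step t the number of tokens still
    masked is floor(num_tokens * cos(pi/2 * t / num_steps)), and the
    final step restores whatever remains.
    """
    counts: list[int] = []
    masked = num_tokens  # start from a fully masked sequence
    for step in range(1, num_steps + 1):
        if step == num_steps:
            target_masked = 0
        else:
            ratio = math.cos(math.pi / 2 * step / num_steps)
            target_masked = min(masked, math.floor(num_tokens * ratio))
        counts.append(masked - target_masked)
        masked = target_masked
    return counts


print(cosine_unmask_counts(16, 4))  # [2, 3, 5, 6]
```

With this shape, few tokens are restored early, when the model has little context, and more are restored later, once most of the image is already known.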
In practice
- Apply Halton sequences for deterministic, spatially uniform sampling.
- Implement cosine schedules for adaptive denoising step sizes.
- Use temperature scaling to calibrate model probabilities for arithmetic coding.
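The last point can be illustrated with a short sketch of why calibration matters for arithmetic coding: an ideal coder spends about -log2(p) bits on a symbol the model assigned probability p, so a temperature that sharpens or flattens the table changes the bit cost directly. The helper names (`softmax_with_temperature`, `code_length_bits`, `to_frequency_table`) and the 16-bit table resolution are hypothetical choices for this example, not details taken from the paper.

```python
import math


def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Softmax of logits / temperature, computed stably."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def code_length_bits(probs: list[float], symbol: int) -> float:
    """Ideal arithmetic-coding cost of `symbol` under `probs`, in bits."""
    return -math.log2(probs[symbol])


def to_frequency_table(probs: list[float], resolution: int = 1 << 16) -> list[int]:
    """Integer counts for a coder; every symbol keeps a count of at least 1
    so that it remains encodable (assumed convention for this sketch)."""
    return [max(1, int(p * resolution)) for p in probs]


logits = [2.0, 0.0, 0.0]  # toy logits over three pixel values
sharp = softmax_with_temperature(logits, 1.0)
flat = softmax_with_temperature(logits, 2.0)

# If the confident guess (symbol 0) is right, the sharper table is cheaper;
# if it is wrong (symbol 1), the flatter table is cheaper.
print(code_length_bits(sharp, 0), code_length_bits(flat, 0))
print(code_length_bits(sharp, 1), code_length_bits(flat, 1))
```

In a synchronized scheme, both ends would need to apply the same temperature and the same quantization so that their probability tables match exactly.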
Topics
- Lossless Image Transmission
- Discrete Diffusion Models
- Source-Channel Coding
- Arithmetic Coding
- Halton Sequences
- Image Compression
Best for: Computer Vision Engineer, Research Scientist, AI Scientist, Machine Learning Engineer, AI Architect
Editorial summary, takeaway, and curation by AIssential. Original article published by cs.AI updates on arXiv.org.