GB-LSR: A Fast Local Spectral Image Representation with a Single Global Bandwidth for Continuous Reconstruction and Super-Resolution
Summary
GB-LSR (Global-Bandwidth Local Spectral Representation) is a fixed-grid local spectral representation for continuous image reconstruction and super-resolution. It partitions an image into non-overlapping square patches, each described by coefficients of a truncated Fourier basis that are predicted from shared convolutional-encoder features. A single trainable scalar bandwidth is applied globally across all patches and images, ensuring reconstruction cost remains independent of image size. In native-reconstruction benchmarks, GB-LSR's main variant beats amortized re-implementations of LIIF, LTE, and WIRE by 2.8-3.6 dB PSNR and improves LPIPS by 0.11-0.15, at approximately one-quarter of the slowest baseline's inference cost. For arbitrary-scale super-resolution (ASR), it achieves competitive PSNR-Y while running 1.44x faster than LIIF-RDN and 3.25x faster than LTE-SwinIR at x4. Further speedups and memory reductions are possible by adjusting the encoder channel width and omitting 4-corner local-ensemble averaging.
Key takeaway
Machine Learning Engineers developing continuous image reconstruction or super-resolution models should evaluate GB-LSR as an efficient alternative to existing implicit representations. In native reconstruction, its global scalar bandwidth approach outperforms amortized LIIF, LTE, and WIRE re-implementations in PSNR and LPIPS. In arbitrary-scale super-resolution, it reaches competitive PSNR-Y while running 1.44x faster than LIIF-RDN and 3.25x faster than LTE-SwinIR at x4. Consider this fixed-grid local spectral representation when inference cost is a constraint in your application.
Key insights
A single global scalar bandwidth, shared across all patches and images, is enough for a local spectral representation to deliver strong reconstruction quality and competitive super-resolution quality at low inference cost.
Principles
- Fixed-grid local spectral representations can achieve high performance.
- A single global bandwidth can suffice for image reconstruction.
- Inference cost can be independent of image size.
Method
GB-LSR partitions images into patches, predicts Fourier basis coefficients from shared convolutional-encoder features, and uses a single trainable global scalar bandwidth for continuous reconstruction.
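The reconstruction step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `reconstruct`, the coefficient layout, and the way the bandwidth scales the phase are all assumptions made for illustration, and the encoder that would predict the coefficients is omitted (they are supplied by hand here).

```python
import math

def reconstruct(coeffs, patch_size, bandwidth, x, y):
    """Evaluate a patch-wise truncated Fourier representation at continuous (x, y).

    coeffs[i][j] maps a frequency index (k, l) to a (cosine, sine) coefficient
    pair for the patch in row i, column j. This layout is hypothetical; in
    GB-LSR the coefficients come from shared convolutional-encoder features.
    bandwidth is the single scalar shared by every patch.
    """
    # Locate the non-overlapping square patch that contains the query point.
    i = min(int(y // patch_size), len(coeffs) - 1)
    j = min(int(x // patch_size), len(coeffs[0]) - 1)
    # Local coordinates relative to the patch's top-left corner, in patch units.
    u = (x - j * patch_size) / patch_size
    v = (y - i * patch_size) / patch_size
    value = 0.0
    for (k, l), (a, b) in coeffs[i][j].items():
        # Assumption: the global bandwidth scales every frequency uniformly.
        phase = 2.0 * math.pi * bandwidth * (k * u + l * v)
        value += a * math.cos(phase) + b * math.sin(phase)
    return value

# One row of two 8x8 patches with hand-written coefficients.
coeffs = [[
    {(0, 0): (0.5, 0.0)},                    # constant patch
    {(0, 0): (0.2, 0.0), (1, 0): (0.3, 0.0)},  # constant plus one horizontal cosine
]]
left = reconstruct(coeffs, 8, 1.0, 3.7, 2.1)   # any point in the left patch
mid = reconstruct(coeffs, 8, 1.0, 12.0, 0.0)   # halfway across the right patch
```

Because the query coordinates are continuous, the same coefficients can be sampled on a denser grid for super-resolution, and changing `bandwidth` rescales every patch's frequencies at once.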
In practice
- Implement a global scalar bandwidth for spectral image representations.
- Tune the encoder channel width (e.g., in the 64 to 96 range) for speed and memory.
- Consider omitting 4-corner local-ensemble averaging for speed.
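To make the cost of the last item concrete, here is a hedged sketch of 4-corner local-ensemble averaging in the style used by LIIF, where the query is predicted from each of the four surrounding patch centers and the results are blended with area-based (bilinear) weights. Whether GB-LSR uses exactly this weighting is an assumption; `ensemble_weights`, `ensemble_predict`, and the `predict` callback are hypothetical names, and border clamping is left to the caller.

```python
import math

def ensemble_weights(x, y, patch_size):
    """Return {(row, col): weight} for the four patch centers around (x, y)."""
    # Express the query in patch units, measured from the first patch center.
    gx = x / patch_size - 0.5
    gy = y / patch_size - 0.5
    j0, i0 = math.floor(gx), math.floor(gy)
    fx, fy = gx - j0, gy - i0
    # Bilinear weights; indices may fall outside the grid near image borders.
    return {
        (i0, j0): (1 - fx) * (1 - fy),
        (i0, j0 + 1): fx * (1 - fy),
        (i0 + 1, j0): (1 - fx) * fy,
        (i0 + 1, j0 + 1): fx * fy,
    }

def ensemble_predict(predict, x, y, patch_size):
    """Blend four per-patch predictions; predict(i, j, x, y) is caller-supplied."""
    weights = ensemble_weights(x, y, patch_size)
    return sum(w * predict(i, j, x, y) for (i, j), w in weights.items())

calls = []

def constant_predict(i, j, x, y):
    # Stand-in for evaluating patch (i, j)'s representation at (x, y).
    calls.append((i, j))
    return 2.0

corner = ensemble_weights(8.0, 8.0, 8)   # point shared by four 8x8 patches
blended = ensemble_predict(constant_predict, 8.0, 8.0, 8)
```

Each query triggers four patch evaluations instead of one, which is why dropping the ensemble trades some smoothness across patch boundaries for speed.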
Topics
- Image Reconstruction
- Super-Resolution
- Local Spectral Representation
- Fourier Basis
- Convolutional Encoders
- Computer Vision
Best for: Research Scientist, AI Scientist, Machine Learning Engineer, Computer Vision Engineer
Editorial summary, takeaway, and curation by AIssential. Original article published by Computer Vision and Pattern Recognition.