Wavelet Transform and JPEG 2000 - Multi-Resolution Analysis for Image Compression
Wavelet Transform Fundamentals - Simultaneous Time-Frequency Analysis
The wavelet transform analyzes a signal at several scales (resolutions) and positions at once. Where the Fourier transform extracts frequency components from the signal as a whole, the wavelet transform also shows where a component of a given scale occurs. In image compression, this multi-resolution property enables efficient coding strategies.
Difference from Fourier transform:
The Fourier transform uses basis functions that span the entire signal, so its coefficients describe which frequencies are present but not where they occur. The Short-Time Fourier Transform (STFT) localizes the analysis with a window function, but the fixed window size forces one time-frequency resolution tradeoff on all frequencies. The wavelet transform instead uses narrow windows (high time resolution) for high frequencies and wide windows (high frequency resolution) for low frequencies, adapting the tradeoff to each scale.
Mother wavelet:
The basis functions of a wavelet transform are all derived from one prototype, the mother wavelet psi(t), by scaling (dilation) and shifting (translation): psi_a,b(t) = (1/sqrt(a)) x psi((t-b)/a), where a > 0 is the scale and b is the position parameter. The factor 1/sqrt(a) keeps the energy of psi_a,b the same at every scale. Image compression widely uses Daubechies wavelets and CDF 9/7 wavelets.
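As a rough illustration of dilation and translation, the sketch below builds psi_a,b from a Mexican-hat (Ricker) shape. The choice of mother wavelet and the helper names are assumptions made for illustration only; this is not one of the wavelets used in JPEG 2000. It estimates the energy numerically at two scales to show the effect of the 1/sqrt(a) factor.

```python
import math

def mother_wavelet(t):
    """Mexican-hat (Ricker) shape, unnormalized; chosen only for illustration."""
    return (1.0 - t * t) * math.exp(-t * t / 2.0)

def wavelet(t, a, b):
    """psi_a,b(t) = (1/sqrt(a)) * psi((t - b) / a), with scale a > 0 and position b."""
    return mother_wavelet((t - b) / a) / math.sqrt(a)

def energy(a, b, lo=-100.0, hi=100.0, dt=0.01):
    """Riemann-sum estimate of the integral of psi_a,b(t)^2 over [lo, hi]."""
    steps = int((hi - lo) / dt)
    return sum(wavelet(lo + k * dt, a, b) ** 2 for k in range(steps)) * dt

narrow = energy(a=0.5, b=0.0)   # small scale: narrow window, fine time resolution
wide = energy(a=4.0, b=3.0)     # large scale: wide window, shifted to t = 3
print(narrow, wide)             # the two energies agree up to numerical error
```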
2D wavelet transform:
For images, the 1D wavelet transform is applied along rows and then along columns. One decomposition level splits the image into 4 subbands: LL (low-low), LH (low-high: horizontal edges), HL (high-low: vertical edges), HH (high-high: diagonal edges). Decomposing LL again, level after level, produces the multi-resolution representation.
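A minimal sketch of one decomposition level, assuming the Haar wavelet (the simplest case, not the filter JPEG 2000 uses) and an image with even width and height. Function names are hypothetical. The test image contains a single horizontal edge, so only LL and LH should carry energy.

```python
import math

def haar_1d(x):
    """One-level Haar split of an even-length sequence into (low, high) halves."""
    s = math.sqrt(2.0)
    low = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return low, high

def transpose(m):
    return [list(col) for col in zip(*m)]

def split_rows(m):
    """Apply haar_1d to every row; return the low and high halves as two matrices."""
    pairs = [haar_1d(row) for row in m]
    return [p[0] for p in pairs], [p[1] for p in pairs]

def haar_2d(img):
    """One decomposition level: filter along rows, then along columns."""
    low_h, high_h = split_rows(img)                                   # horizontal pass
    ll, lh = (transpose(m) for m in split_rows(transpose(low_h)))    # vertical pass
    hl, hh = (transpose(m) for m in split_rows(transpose(high_h)))
    return ll, lh, hl, hh

# 4x4 image with one horizontal edge between the first and second row.
image = [[0, 0, 0, 0],
         [8, 8, 8, 8],
         [8, 8, 8, 8],
         [8, 8, 8, 8]]
ll, lh, hl, hh = haar_2d(image)
print(lh)   # non-zero only where the horizontal edge is
print(hl)   # all zeros: the image has no vertical edges
```

Calling haar_2d again on ll would give the second decomposition level.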
Discrete Wavelet Transform (DWT) and Filter Banks
Discrete Wavelet Transform (DWT) is efficiently implemented using filter bank structures. Low-pass and high-pass filter pairs decompose signals, with downsampling achieving multi-resolution decomposition at each level.
Analysis filter bank:
Input signals pass through low-pass filter h0 (corresponding to scaling function) and high-pass filter h1 (corresponding to wavelet function), each followed by 2x downsampling. Low-pass output is approximation coefficients; high-pass output is detail coefficients. Further decomposing approximation enables multi-level decomposition.
Synthesis filter bank:
Inverse transform upsamples each subband by 2x, passes through synthesis filters g0, g1, and sums. Filter designs satisfying perfect reconstruction conditions recover original signals without information loss - the foundation for lossless compression.
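The analysis and synthesis stages can be sketched as an explicit two-channel filter bank. The example assumes Haar filters for h0, h1, g0, g1 because they are short enough to check by hand; the function names and the choice of which samples to keep after downsampling are illustrative assumptions, not taken from a specific library.

```python
import math

S = 1.0 / math.sqrt(2.0)
H0, H1 = [S, S], [S, -S]     # analysis low-pass / high-pass (Haar)
G0, G1 = [S, S], [-S, S]     # matching synthesis filters

def convolve(x, h):
    """Full linear convolution of sequence x with filter h."""
    out = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for k, hk in enumerate(h):
            out[i + k] += xi * hk
    return out

def analyze(x):
    """Filter with H0 and H1, then keep every second sample (2x downsampling)."""
    approx = convolve(x, H0)[1:len(x):2]
    detail = convolve(x, H1)[1:len(x):2]
    return approx, detail

def upsample(c):
    """Insert a zero after every coefficient (2x upsampling)."""
    out = []
    for v in c:
        out.extend([v, 0.0])
    return out

def synthesize(approx, detail):
    """Upsample each subband, filter with G0 and G1, and sum the two branches."""
    low = convolve(upsample(approx), G0)
    high = convolve(upsample(detail), G1)
    return [low[i] + high[i] for i in range(2 * len(approx))]

signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
approx, detail = analyze(signal)
restored = synthesize(approx, detail)
print(restored)   # matches signal up to floating-point error
```

Feeding approx back into analyze gives the next decomposition level.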
CDF 9/7 wavelet:
JPEG 2000's lossy compression uses Cohen-Daubechies-Feauveau 9/7 wavelet with 9-tap low-pass and 7-tap high-pass filters. Symmetric biorthogonal filters with linear phase characteristics minimize edge artifacts. Lifting scheme implementation reduces multiplication count for fast computation.
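A sketch of a lifting implementation, under several assumptions: the constants are the values commonly quoted for CDF 9/7 lifting, the final scaling by K follows one of several normalization conventions found in the literature, the input has even length, and borders are handled by mirroring. Function names are hypothetical.

```python
# Lifting constants commonly quoted for the CDF 9/7 wavelet.
ALPHA, BETA = -1.586134342, -0.052980118
GAMMA, DELTA = 0.882911075, 0.443506852
K = 1.230174105   # scaling factor; normalization conventions differ between texts

def _lift(x, start, coef):
    """Add coef * (left + right neighbor) to every second sample from `start`,
    mirroring the signal at both ends (symmetric extension)."""
    n = len(x)
    for i in range(start, n, 2):
        left = x[i - 1] if i > 0 else x[1]
        right = x[i + 1] if i + 1 < n else x[n - 2]
        x[i] += coef * (left + right)

def forward_97(signal):
    """One-level forward transform of an even-length signal; returns (low, high)."""
    x = list(signal)
    _lift(x, 1, ALPHA)   # predict 1 (odd samples)
    _lift(x, 0, BETA)    # update 1 (even samples)
    _lift(x, 1, GAMMA)   # predict 2
    _lift(x, 0, DELTA)   # update 2
    return [v / K for v in x[0::2]], [v * K for v in x[1::2]]

def inverse_97(low, high):
    """Undo forward_97 by running the lifting steps backwards with flipped signs."""
    x = [0.0] * (len(low) + len(high))
    x[0::2] = [v * K for v in low]
    x[1::2] = [v / K for v in high]
    _lift(x, 0, -DELTA)
    _lift(x, 1, -GAMMA)
    _lift(x, 0, -BETA)
    _lift(x, 1, -ALPHA)
    return x

low, high = forward_97([10.0] * 8)   # flat signal: detail coefficients vanish
restored = inverse_97(low, high)
```

Each lifting step costs one multiplication per updated sample, which is where the saving over direct 9-tap and 7-tap convolution comes from.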
CDF 5/3 wavelet:
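JPEG 2000's lossless compression uses the CDF 5/3 wavelet (also known as the LeGall 5/3 wavelet), which can be implemented with integer arithmetic only. Its lifting steps need only additions and shifts, and the rounding applied in each step is undone exactly by the inverse transform, so integer input is reconstructed perfectly. In lossy use it generally codes less efficiently than CDF 9/7, but it suits medical imaging and archival, where information loss is unacceptable.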
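The integer-only property can be sketched with the two lifting steps of the 5/3 transform. This is a simplified 1D illustration assuming even-length input and mirrored borders; the function names are hypothetical. Python's // operator is floor division, which is the rounding the forward and inverse steps must share.

```python
def forward_53(x):
    """Reversible 5/3 lifting on an even-length list of ints; returns (s, d)."""
    n = len(x)
    half = n // 2
    d = []
    for i in range(half):
        left = x[2 * i]
        right = x[2 * i + 2] if 2 * i + 2 < n else x[n - 2]   # mirror at the end
        d.append(x[2 * i + 1] - (left + right) // 2)          # predict odd samples
    s = []
    for i in range(half):
        prev = d[i - 1] if i > 0 else d[0]                    # mirror at the start
        s.append(x[2 * i] + (prev + d[i] + 2) // 4)           # update even samples
    return s, d

def inverse_53(s, d):
    """Undo forward_53 exactly by reversing the two lifting steps."""
    half = len(s)
    n = 2 * half
    x = [0] * n
    for i in range(half):
        prev = d[i - 1] if i > 0 else d[0]
        x[2 * i] = s[i] - (prev + d[i] + 2) // 4
    for i in range(half):
        left = x[2 * i]
        right = x[2 * i + 2] if 2 * i + 2 < n else x[n - 2]
        x[2 * i + 1] = d[i] + (left + right) // 2
    return x

samples = [12, 14, 200, 13, 0, 255, 7, 7]
s, d = forward_53(samples)
assert inverse_53(s, d) == samples   # bit-exact, no rounding error
```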
JPEG 2000 Compression Pipeline
JPEG 2000 is a wavelet-based image compression standard, standardized as ISO/IEC 15444 in 2000. Compared to conventional JPEG (DCT-based), it offers improved low-bitrate quality, unified lossless/lossy framework, and progressive transmission capabilities.
Overall compression flow:
1. Preprocessing (DC level shift, color space conversion RGB to YCbCr).
2. Tile division (optional, splitting large images into independently processable tiles).
3. DWT (CDF 9/7 or CDF 5/3, 5-6 level decomposition).
4. Quantization (dividing subband coefficients by step size and rounding).
5. EBCOT coding (bit-plane coding + arithmetic coding).
6. Packetization and codestream generation.
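Step 1 can be sketched for a single pixel. The example assumes 8-bit unsigned RGB input and the familiar floating-point YCbCr matrix used on the lossy path; the function names are hypothetical.

```python
def dc_level_shift(sample, bit_depth=8):
    """Center an unsigned sample around zero by subtracting 2^(bit_depth - 1)."""
    return sample - (1 << (bit_depth - 1))

def rgb_to_ycbcr(r, g, b):
    """Floating-point RGB to YCbCr conversion (luma plus two chroma differences)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

def preprocess(pixel, bit_depth=8):
    """Level-shift each component, then convert the color space."""
    r, g, b = (dc_level_shift(c, bit_depth) for c in pixel)
    return rgb_to_ycbcr(r, g, b)

gray = preprocess((128, 128, 128))    # mid-gray ends up at the origin
white = preprocess((255, 255, 255))   # luma near 127, chroma near 0
```

Gray pixels carry no chroma after the conversion, which is why the Cb and Cr planes of natural images usually compress well.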
Tile division:
Large images (satellite, medical) can be divided into independently processed tiles, typically 256x256 to 1024x1024 pixels. Because each tile is transformed on its own, tile boundaries can become visible at low bitrates; larger tiles, overlap, or boundary processing mitigate this. Tiling is optional - an image that fits in memory usually compresses better as a single tile.
Quantization:
DWT coefficient quantization reduces information. Different step sizes Delta per subband quantize coefficients as q = sign(c) x floor(|c|/Delta). Larger step sizes increase compression but reduce quality. Lossless mode skips quantization (Delta=1), coding coefficients directly.
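The quantization formula maps directly to code. In the sketch below, quantize follows q = sign(c) x floor(|c|/Delta); the dequantize helper and its midpoint reconstruction (r = 0.5) are assumptions added for illustration, since the reconstruction point is left to the decoder.

```python
import math

def quantize(c, delta):
    """q = sign(c) * floor(|c| / delta): a quantizer with a wide zero bin."""
    sign = -1 if c < 0 else 1
    return sign * math.floor(abs(c) / delta)

def dequantize(q, delta, r=0.5):
    """Reconstruct inside the quantization interval; r = 0.5 is an assumed midpoint."""
    if q == 0:
        return 0.0
    sign = -1 if q < 0 else 1
    return sign * (abs(q) + r) * delta

coeffs = [-7.9, -1.2, 0.4, 3.0, 12.6]
indices = [quantize(c, 2.0) for c in coeffs]      # [-3, 0, 0, 1, 6]
restored = [dequantize(q, 2.0) for q in indices]  # [-7.0, 0.0, 0.0, 3.0, 13.0]
```

Every coefficient with magnitude below Delta falls into the zero bin, which is where most of the bitrate saving in high-frequency subbands comes from.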
Comparison with JPEG:
- Block artifacts: JPEG's 8x8 block DCT causes visible boundaries at low bitrates. JPEG 2000's overlapping wavelet bases eliminate block noise
- Compression efficiency: JPEG 2000 typically reaches the same PSNR at a 20-30% lower bitrate
- Progressive display: JPEG 2000 natively supports quality, resolution, and spatial progressive transmission
EBCOT - Bit-Plane Coding Mechanism
EBCOT (Embedded Block Coding with Optimized Truncation) is JPEG 2000's core coding algorithm. Dividing subbands into code blocks (typically 64x64) and coding bit-plane by bit-plane enables truncation at arbitrary bitrates for flexible rate control.
Bit-plane coding:
Quantized coefficients are coded one bit at a time from MSB (most significant bit) toward LSB (least significant bit). First bit-planes contain most important information (large coefficient signs and positions); later planes add detail. Terminating at any bit-plane enables progressive quality control.
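The MSB-to-LSB ordering can be sketched on a handful of coefficient magnitudes. This shows only the bit-plane split and what truncation does to the values; the actual coding passes and sign handling are omitted, and the function names are hypothetical.

```python
def bit_planes(magnitudes, num_planes):
    """Return bit-planes from MSB to LSB; each plane holds one bit per coefficient."""
    return [[(m >> p) & 1 for m in magnitudes]
            for p in range(num_planes - 1, -1, -1)]

def reconstruct(planes, num_planes):
    """Rebuild magnitudes from the leading planes; missing planes count as zero."""
    values = [0] * len(planes[0])
    for k, plane in enumerate(planes):
        shift = num_planes - 1 - k
        values = [v | (bit << shift) for v, bit in zip(values, plane)]
    return values

magnitudes = [13, 2, 0, 7]             # |q| values; signs are coded separately
planes = bit_planes(magnitudes, 4)     # [[1,0,0,0], [1,0,0,1], [0,1,0,1], [1,0,0,1]]
coarse = reconstruct(planes[:2], 4)    # two planes only: [12, 0, 0, 4]
exact = reconstruct(planes, 4)         # all planes: [13, 2, 0, 7]
```

Stopping after two planes already places the large coefficients roughly right, while small coefficients stay at zero until later planes arrive.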
Three coding passes:
Each bit-plane is coded in up to 3 passes (the most significant bit-plane of a code block uses only the cleanup pass):
1. Significance Propagation Pass: codes bits of still-insignificant coefficients that have at least one significant neighbor.
2. Magnitude Refinement Pass: codes the next bit of coefficients that became significant in an earlier bit-plane.
3. Cleanup Pass: codes all remaining bits of the bit-plane.
This division ensures higher-importance information is coded first.
MQ arithmetic coding:
Each pass output is compressed by the MQ coder, a binary adaptive arithmetic coder. Context-adaptive coding selects a probability model from the significance pattern of neighboring coefficients and updates it as bits are coded. 19 contexts are defined, efficiently exploiting spatial coefficient correlation.
Rate control:
EBCOT's greatest advantage is post-compression rate-distortion optimization: each code block is coded once, and its bitstream can then be truncated at the end of any coding pass. For a given bit budget, the encoder selects the truncation point of every code block so that total distortion is minimized. This gives precise bitrate control without re-running the block coder.
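The truncation-point selection can be sketched as a Lagrangian search. This is a simplified illustration, not the procedure of any particular encoder: the (rate, distortion) pairs are invented, each list is assumed to have diminishing returns per added pass, and the function names are hypothetical.

```python
def best_point(points, lam):
    """Pick the truncation point minimizing distortion + lam * rate."""
    return min(points, key=lambda p: p[1] + lam * p[0])

def allocate(blocks, budget, iterations=60):
    """Bisect on the multiplier lam until the total rate fits the budget."""
    lo, hi = 0.0, 1e9          # lam = 0 keeps every pass; a huge lam drops them all
    for _ in range(iterations):
        lam = (lo + hi) / 2
        total = sum(best_point(b, lam)[0] for b in blocks)
        if total > budget:
            lo = lam           # over budget: penalize rate more strongly
        else:
            hi = lam           # within budget: try a smaller penalty
    return [best_point(b, hi) for b in blocks]

# Hypothetical (rate in bytes, distortion) after each coding pass of two code blocks.
blocks = [
    [(0, 100.0), (10, 40.0), (20, 20.0), (30, 15.0)],
    [(0, 50.0), (10, 30.0), (20, 25.0), (30, 24.0)],
]
chosen = allocate(blocks, budget=30)
print(chosen)   # [(20, 20.0), (10, 30.0)]
```

The first block receives more bytes because its early passes remove more distortion per byte than those of the second block.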
JPEG 2000 Applications and Current Position
JPEG 2000 is technically superior but has limited Web adoption. However, it's widely adopted as standard in specific professional fields. This section covers applications and positioning within the current image compression ecosystem.
Digital Cinema (DCI):
The DCI (Digital Cinema Initiatives) movie distribution standard adopts JPEG 2000 as the sole compression format. Each 4K (4096x2160) frame is JPEG 2000 compressed and stored in an MXF container, with 12-bit color depth in the XYZ color space. The maximum bitrate of 250Mbps corresponds to approximately 1.3MB per frame at 24fps.
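The per-frame figure follows from simple arithmetic, worked through below. The uncompressed size assumes a full 4096x2160 frame with three 12-bit components and no padding, which is an assumption made for the comparison.

```python
BITRATE_BPS = 250_000_000        # 250 Mbps
FPS = 24
WIDTH, HEIGHT = 4096, 2160
BITS_PER_PIXEL = 3 * 12          # three 12-bit components (X, Y, Z)

bytes_per_frame = BITRATE_BPS / FPS / 8
raw_bytes_per_frame = WIDTH * HEIGHT * BITS_PER_PIXEL / 8
compression_ratio = raw_bytes_per_frame / bytes_per_frame

print(f"{bytes_per_frame / 1e6:.2f} MB per compressed frame")      # 1.30
print(f"{raw_bytes_per_frame / 1e6:.1f} MB per uncompressed frame")  # 39.8
print(f"about {compression_ratio:.0f}:1 compression")              # about 31:1
```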
Medical imaging (DICOM):
DICOM medical imaging standard widely uses JPEG 2000 lossless compression. CT, MRI, and X-ray images where diagnostic information loss is unacceptable achieve 2-3x file size reduction through lossless compression. 16-bit image support is also a critical requirement.
Satellite imagery and geospatial:
Satellite and aerial image distribution widely uses JPEG 2000. Progressive transmission of enormous images (tens of thousands of pixels) and ROI (Region of Interest) coding for partial high-quality access suit geographic information system (GIS) usage.
Web positioning:
Browser JPEG 2000 support is Safari-only (as of 2023) - Chrome and Firefox don't support it. AVIF (AV1-based) and WebP cover most JPEG 2000 technical advantages for web with broader browser support, making AVIF/WebP recommended for web. JPEG 2000 is positioned as a specialized professional-domain standard.
Implementation and Usage - OpenJPEG and Python Processing
This section covers JPEG 2000 implementation libraries and Python methods for reading, writing, and converting files, along with practical usage patterns and a wavelet transform implementation for hands-on development.
OpenJPEG library:
OpenJPEG is JPEG 2000's open-source reference implementation supporting both encoding and decoding. Implemented in C, it serves as backend for many image processing tools (GDAL, ImageMagick, Pillow). Command-line tools opj_compress and opj_decompress enable direct conversion.
Python JPEG 2000 processing:
Pillow (PIL) supports JPEG 2000 read/write. img.save('output.jp2', 'JPEG2000', quality_mode='rates', quality_layers=[20]) encodes at 20:1 compression ratio. Glymur library provides detailed JPEG 2000 parameter control including tile size, decomposition levels, and code block sizes.
PyWavelets DWT implementation:
PyWavelets (pywt) is well suited to direct wavelet transform manipulation in Python. coeffs = pywt.dwt2(image, 'db4') executes a 1-level 2D DWT and returns the approximation and detail coefficients as (cA, (cH, cV, cD)). Multi-level decomposition uses pywt.wavedec2(image, 'db4', level=5).
Simple wavelet compression implementation:
A simple educational compressor sets small DWT coefficients to zero (thresholding) and then reconstructs the image with the inverse transform. Decompose with coeffs = pywt.wavedec2(img, 'bior4.4', level=5), then zero the coefficients in each detail subband whose magnitude falls below a threshold. The percentage of zeroed coefficients is only a rough indicator of the achievable compression ratio, since the surviving values and their positions still have to be entropy coded; zeroing 95% of coefficients often maintains visually acceptable quality. This is a crude stand-in for JPEG 2000's quantization step.
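The same idea can be sketched without dependencies. The example below swaps in a 1D multi-level Haar transform written in plain Python in place of pywt.wavedec2 with 'bior4.4', keeps only the largest detail coefficients, and measures the reconstruction error. The test signal, the keep_fraction parameter, and all function names are hypothetical.

```python
import math

def haar_step(x):
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail

def haar_unstep(approx, detail):
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out

def decompose(signal, levels):
    """Multi-level Haar DWT; returns [cA_n, cD_n, ..., cD_1]."""
    approx, details = list(signal), []
    for _ in range(levels):
        approx, d = haar_step(approx)
        details.insert(0, d)
    return [approx] + details

def reconstruct(coeffs):
    approx = coeffs[0]
    for d in coeffs[1:]:
        approx = haar_unstep(approx, d)
    return approx

def compress(signal, levels, keep_fraction):
    """Zero all detail coefficients except roughly the largest keep_fraction."""
    coeffs = decompose(signal, levels)
    mags = sorted((abs(v) for band in coeffs[1:] for v in band), reverse=True)
    threshold = mags[max(1, int(len(mags) * keep_fraction)) - 1]
    kept = [coeffs[0]] + [[v if abs(v) >= threshold else 0.0 for v in band]
                          for band in coeffs[1:]]
    return reconstruct(kept), kept

# Smooth sine with a step at sample 40; length must be divisible by 2**levels.
signal = [math.sin(2 * math.pi * i / 64) + (0.5 if i >= 40 else 0.0) for i in range(64)]
approx, kept = compress(signal, levels=4, keep_fraction=0.25)
zeros = sum(v == 0.0 for band in kept[1:] for v in band)
rmse = math.sqrt(sum((a - b) ** 2 for a, b in zip(signal, approx)) / len(signal))
print(f"{zeros} of 60 detail coefficients zeroed, RMSE {rmse:.3f}")
```

Most of the surviving coefficients sit in the coarse levels, which mirrors why images tolerate aggressive thresholding of fine-scale subbands.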