Deep Dive into Image Compression Algorithms - DCT, Wavelet Transform, and Predictive Coding

Fundamental Principles of Image Compression - Redundancy Removal and Human Vision

Image compression reduces file size by removing redundancy from image data. Uncompressed images are large: a 4000x3000px image at 24 bits per pixel occupies 36MB raw (4000 x 3000 x 3 bytes), which JPEG compression typically reduces to around 2-5MB.
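The arithmetic behind these figures can be checked with a short sketch. The helper name raw_size_bytes is made up for illustration, and the calculation ignores file headers and row padding.

```python
def raw_size_bytes(width: int, height: int, bits_per_pixel: int) -> int:
    """Size of an uncompressed raster image, ignoring headers and row padding."""
    return width * height * bits_per_pixel // 8


raw = raw_size_bytes(4000, 3000, 24)
print(f"raw size: {raw / 1_000_000:.0f} MB")

# Compression ratio implied by the 2-5MB JPEG range quoted above.
for jpeg_mb in (2, 5):
    ratio = raw / (jpeg_mb * 1_000_000)
    print(f"{jpeg_mb} MB JPEG -> {ratio:.1f}:1")
```

The implied ratios (roughly 7:1 to 18:1) sit at the lower end of the range usually quoted for lossy compression, which is consistent with a high-quality JPEG setting.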

Types of redundancy in image data:

Algorithm classification: Lossless compression (PNG with Deflate, WebP Lossless) typically achieves around 2:1-3:1 on photographic content with perfect reconstruction. Lossy compression (JPEG with DCT, WebP with VP8, AVIF with AV1) achieves 10:1-50:1 or more by discarding information that viewers are unlikely to notice. Most modern lossy codecs follow a three-stage pipeline of transform, quantization, and entropy coding, and each stage targets a different type of redundancy.

DCT (Discrete Cosine Transform) - Core Technology Behind JPEG

The DCT is the mathematical transform at the core of JPEG compression. It converts image data from the spatial domain (pixel values) to the frequency domain (the strength of each frequency component), which makes it possible to coarsely represent or drop the high-frequency components to which the human eye is least sensitive.

JPEG DCT processing flow:

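The DCT itself is reversible: applying the inverse DCT (IDCT) to unquantized coefficients reconstructs the original values up to floating-point rounding. Quantization is the step where JPEG deliberately discards information. Chroma subsampling, when enabled, is the other lossy step and happens before the DCT.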
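The following is a minimal sketch of this point, not the JPEG reference implementation. It builds an orthonormal 8x8 DCT-II from the textbook definition, shows that the round trip without quantization is essentially exact, and then applies a single uniform quantization step. The block contents and the step size of 16 are assumptions chosen for illustration; real JPEG uses a per-frequency quantization table.

```python
import math

N = 8  # JPEG block size


def dct_matrix(n: int = N) -> list[list[float]]:
    """Orthonormal DCT-II basis: row k holds the k-th cosine basis vector."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(r) for r in zip(*m)]


C = dct_matrix()
CT = transpose(C)


def dct2(block):
    """2D DCT of an 8x8 block: C * block * C^T."""
    return matmul(matmul(C, block), CT)


def idct2(coeffs):
    """Inverse 2D DCT: C^T * coeffs * C."""
    return matmul(matmul(CT, coeffs), C)


def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# Illustrative block: a smooth gradient, level-shifted by 128 as JPEG does.
block = [[(16 * x + 2 * y) - 128 for x in range(N)] for y in range(N)]

coeffs = dct2(block)
max_err_no_quant = max_abs_diff(block, idct2(coeffs))

STEP = 16  # assumed uniform step; JPEG uses an 8x8 table instead
quantized = [[round(c / STEP) for c in row] for row in coeffs]
dequantized = [[q * STEP for q in row] for row in quantized]
max_err_quant = max_abs_diff(block, idct2(dequantized))
nonzero = sum(1 for row in quantized for q in row if q != 0)

print(f"max error without quantization: {max_err_no_quant:.2e}")
print(f"max error with quantization:    {max_err_quant:.2f}")
print(f"nonzero quantized coefficients: {nonzero} of {N * N}")
```

For a smooth block like this one, most quantized coefficients are zero, and that sparsity is what the later entropy coding stage exploits.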

Wavelet Transform - Foundation of JPEG 2000 and Next-Gen Compression

Wavelet-based coding was adopted to overcome the limitations of block-based DCT. It is used in JPEG 2000, some HEIF modes, and medical imaging (DICOM). While the DCT in JPEG processes fixed 8x8 blocks, the wavelet transform decomposes the whole image (or large tiles of it) into multiple resolutions.

How wavelet transform works:

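Advantages over DCT: no block artifacts (the transform spans the whole image rather than 8x8 blocks), progressive display (data can be transmitted from low to high resolution incrementally), and ROI coding (quality can be selectively preserved in specific regions). JPEG 2000 uses the CDF 9/7 wavelet for lossy coding and the CDF 5/3 wavelet (also called LeGall 5/3) for lossless coding. The 5/3 filter can be computed entirely in integer arithmetic, which is what makes exact reconstruction possible.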
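As a rough illustration of the lossless 5/3 case, here is one level of the integer lifting scheme applied to a 1D signal. This is a simplified sketch: it assumes an even-length input, mirrors samples at the edges, and leaves out the 2D row/column passes and multi-level recursion that a JPEG 2000 codec performs. The function names are made up for this example.

```python
def forward_53(x: list[int]) -> tuple[list[int], list[int]]:
    """One level of the reversible 5/3 lifting transform (even-length input).

    Returns (s, d): the low-pass (approximation) and high-pass (detail) halves.
    """
    assert len(x) >= 2 and len(x) % 2 == 0
    half = len(x) // 2
    even, odd = x[0::2], x[1::2]
    # Predict: each odd sample minus the floor-average of its even neighbours.
    # The right edge mirrors the last even sample (symmetric extension).
    d = [odd[n] - (even[n] + even[min(n + 1, half - 1)]) // 2 for n in range(half)]
    # Update: adjust even samples with neighbouring details (left edge mirrored).
    s = [even[n] + (d[max(n - 1, 0)] + d[n] + 2) // 4 for n in range(half)]
    return s, d


def inverse_53(s: list[int], d: list[int]) -> list[int]:
    """Undo forward_53 exactly by running the lifting steps in reverse order."""
    half = len(s)
    even = [s[n] - (d[max(n - 1, 0)] + d[n] + 2) // 4 for n in range(half)]
    odd = [d[n] + (even[n] + even[min(n + 1, half - 1)]) // 2 for n in range(half)]
    x = [0] * (2 * half)
    x[0::2], x[1::2] = even, odd
    return x


ramp = [10, 12, 14, 16, 18, 20, 22, 24]   # smooth signal
noisy = [3, 7, 1, 8, 2, 9, 4, 6]          # rough signal

for signal in (ramp, noisy):
    s, d = forward_53(signal)
    print("low-pass:", s, "detail:", d)
    assert inverse_53(s, d) == signal       # exact reconstruction
```

On the smooth ramp the detail coefficients are zero except at the mirrored edge, whereas the rough signal keeps large details. Applying the same step again to the low-pass half would give the next, coarser resolution level.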

Predictive Coding and Intra Prediction - AV1/HEVC High-Efficiency Compression

Predictive coding predicts the current pixel values from adjacent pixels that have already been coded, and encodes only the prediction residual (the difference between actual and predicted values). More accurate predictions yield smaller residuals and therefore higher compression. This technique is central to still-image coding in AVIF (AV1) and HEVC-based HEIF.

Intra Prediction mechanism:

Why AV1 achieves high efficiency: variable block sizes (from 128x128 down to 4x4, split recursively), 56 directional intra prediction angles alongside non-directional modes, adaptive transform type selection (DCT, ADST, identity), in-loop filters (deblocking, CDEF, loop restoration), and a multi-symbol adaptive arithmetic coder for entropy coding.
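The idea of intra prediction can be sketched with three simple modes. This toy example is not the AV1 or HEVC algorithm: it uses only DC, vertical, and horizontal predictors, picks the mode by sum of absolute differences (SAD), and works on a hypothetical 4x4 block with made-up neighbor pixels.

```python
def predict(mode: str, top: list[int], left: list[int], size: int) -> list[list[int]]:
    """Build a size x size prediction from already-decoded neighbour pixels."""
    if mode == "vertical":      # copy the row above downwards
        return [list(top) for _ in range(size)]
    if mode == "horizontal":    # copy the left column rightwards
        return [[left[y]] * size for y in range(size)]
    if mode == "dc":            # rounded mean of all neighbours
        dc = (sum(top) + sum(left) + size) // (2 * size)
        return [[dc] * size for _ in range(size)]
    raise ValueError(f"unknown mode: {mode}")


def sad(block, pred) -> int:
    """Sum of absolute differences between a block and its prediction."""
    return sum(abs(b - p) for rb, rp in zip(block, pred) for b, p in zip(rb, rp))


def encode_block(block, top, left):
    """Pick the mode with the smallest SAD; return (mode, residual)."""
    size = len(block)
    costs = {m: sad(block, predict(m, top, left, size))
             for m in ("dc", "vertical", "horizontal")}
    mode = min(costs, key=costs.get)
    pred = predict(mode, top, left, size)
    residual = [[b - p for b, p in zip(rb, rp)] for rb, rp in zip(block, pred)]
    return mode, residual


def decode_block(mode, residual, top, left):
    """Rebuild the block from the signalled mode plus the residual."""
    pred = predict(mode, top, left, len(residual))
    return [[p + r for p, r in zip(rp, rr)] for rp, rr in zip(pred, residual)]


# Hypothetical data: vertical stripes, so the row above predicts the block well.
top = [10, 50, 90, 130]
left = [12, 12, 12, 12]
block = [
    [10, 50, 90, 130],
    [11, 50, 90, 131],
    [10, 51, 90, 130],
    [10, 50, 89, 130],
]

mode, residual = encode_block(block, top, left)
print(mode, residual)
assert decode_block(mode, residual, top, left) == block
```

The residual values are all close to zero, so they cost far fewer bits to encode than the raw pixel values. Real codecs additionally transform and quantize the residual, and they choose modes by rate-distortion cost rather than SAD alone.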

Entropy Coding - Huffman and Arithmetic Coding Principles

Entropy coding is the final stage, and it removes statistical redundancy. It assigns shorter bit sequences to frequent symbols and longer ones to rare symbols, so that the output size approaches the theoretical minimum given by the Shannon entropy.
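A small sketch of Huffman coding shows how code lengths follow symbol frequency. The function below computes only the code length per symbol, not the actual bit patterns, and the input string is an arbitrary example.

```python
import heapq
from collections import Counter


def huffman_code_lengths(data) -> dict:
    """Return {symbol: code length in bits} for a Huffman code built from data."""
    freq = Counter(data)
    if len(freq) == 1:                      # degenerate case: one distinct symbol
        return {next(iter(freq)): 1}
    # Heap entries: (weight, tie_breaker, {symbol: depth so far}).
    # The tie breaker keeps Python from ever comparing the dicts.
    heap = [(w, i, {sym: 0}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, a = heapq.heappop(heap)      # two least frequent subtrees
        w2, _, b = heapq.heappop(heap)
        merged = {sym: depth + 1 for sym, depth in {**a, **b}.items()}
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


data = "aaaaaaaabbbbccdd"                   # a: 8, b: 4, c: 2, d: 2
lengths = huffman_code_lengths(data)
total_bits = sum(lengths[sym] for sym in data)

print(lengths)
print(f"Huffman: {total_bits} bits, fixed 2-bit code: {2 * len(data)} bits")
```

Here the most frequent symbol gets a 1-bit code and the rarest get 3-bit codes, for 28 bits in total against 32 bits with a fixed-length code.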

Major entropy coding methods:

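Shannon entropy formula: H = -Σ p(x) * log2(p(x)) gives the theoretical minimum average number of bits per symbol. For a two-symbol source, equal probabilities (50/50) yield 1 bit/symbol, while a 99/1 split yields about 0.08 bits/symbol, which allows dramatic compression. A Huffman code cannot spend less than 1 bit per symbol, so sources this skewed are where arithmetic coding pays off.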
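The two figures quoted above can be reproduced directly from the formula. This is a minimal sketch, and the function name is chosen here for illustration.

```python
import math


def shannon_entropy(probs) -> float:
    """H = -sum(p * log2(p)) in bits per symbol; zero-probability terms are skipped."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


print(shannon_entropy([0.5, 0.5]))              # equal probabilities
print(round(shannon_entropy([0.99, 0.01]), 4))  # highly skewed source
```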

Compression Quality Metrics - PSNR, SSIM, VMAF Differences and Usage

Multiple metrics exist for objectively evaluating compression quality, each measuring different aspects. Selecting appropriate metrics enables compression parameter optimization and fair format comparisons.

Major quality metrics:

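Practical selection: use SSIM for fast bulk comparison, VMAF when closer agreement with perceived quality matters, and Butteraugli for automatic parameter tuning. Because each metric captures a different aspect of quality, combining several of them gives a more reliable judgment than relying on any single score.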
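Of the metrics named in this section's title, PSNR is the simplest to compute by hand. The sketch below assumes 8-bit grayscale images given as nested lists; the tiny sample image is made up for illustration.

```python
import math


def psnr(original, compressed, max_value: int = 255) -> float:
    """Peak signal-to-noise ratio in dB between two same-sized grayscale images."""
    flat_o = [p for row in original for p in row]
    flat_c = [p for row in compressed for p in row]
    if len(flat_o) != len(flat_c):
        raise ValueError("images must have the same dimensions")
    mse = sum((a - b) ** 2 for a, b in zip(flat_o, flat_c)) / len(flat_o)
    if mse == 0:
        return math.inf                     # identical images
    return 10 * math.log10(max_value ** 2 / mse)


original = [[100, 120], [140, 160]]
off_by_one = [[101, 121], [141, 161]]       # every pixel differs by 1, so MSE = 1

print(psnr(original, original))
print(f"{psnr(original, off_by_one):.2f} dB")
```

PSNR depends only on the mean squared error, so it cannot tell a visually obvious artifact from an invisible one of equal energy. That gap is what perceptual metrics such as SSIM and VMAF try to close.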
