
Image Inpainting Technology and Applications - From Classical Methods to Deep Learning


What is Image Inpainting - Technology for Naturally Filling Missing Regions

Image inpainting naturally restores specified regions (masks) in images using surrounding information. It is an important field with broad applications including unwanted object removal from photos, scratch repair in old photographs, watermark removal, and text overlay removal.

Inpainting inputs and outputs: the algorithm takes an image plus a binary mask marking the pixels to repair, and outputs an image in which the masked pixels are replaced with content estimated from the surrounding known region.

Technical challenges: Inpainting is inherently an ill-posed problem. No "correct answer" exists for damaged regions; the most natural-looking content must be estimated from surrounding context. Small scratch repair (few pixels) is relatively easy, but large region repair (20%+ of image) is extremely difficult, requiring semantic understanding.

Method classification:

- Diffusion-based (Navier-Stokes, Telea): small scratches and thin lines
- Patch-based (Criminisi, PatchMatch): moderate damage in textured regions
- Deep learning-based (Partial/Gated Convolution, LaMa, diffusion models): large regions and structurally complex content

Selecting the appropriate method based on repair region size and content complexity is crucial.

Classical Methods - Navier-Stokes and Telea Algorithms

Two classical inpainting algorithms implemented in OpenCV are explained. Both are suitable for small damage repair (scratches, text, thin lines) with low computational cost and easy implementation.

Navier-Stokes method (cv2.INPAINT_NS): Proposed by Bertalmio et al. (2001), inspired by the Navier-Stokes equations of fluid dynamics. Propagates image isophotes (curves of equal intensity) from the damaged region boundary inward.

Principle: Computes gradient direction and magnitude at damaged region boundaries, iteratively propagating this information inward. Preserving isophote curvature naturally maintains edge continuity.

Telea method (cv2.INPAINT_TELEA): FMM (Fast Marching Method) based approach proposed by Alexandru Telea (2004). Fills values from damaged region boundaries inward using distance-weighted averages.

Principle: Processes pixels sequentially from boundary inward, determining values as weighted averages of known pixels (closer pixels weighted higher). FMM optimizes processing order for fast operation.

OpenCV implementation:

result = cv2.inpaint(img, mask, inpaintRadius=3, flags=cv2.INPAINT_TELEA)

inpaintRadius is the neighborhood radius (pixels) referenced for each pixel's repair. 3-5 is typical; larger values produce smoother results but increase processing time.

Performance comparison: For 1920x1080 image with 1% mask area, Telea takes approximately 50ms, NS approximately 200ms. NS excels at edge preservation while Telea excels at uniform region repair. Both methods' quality degrades rapidly when mask area exceeds 5%, producing blurry results.

Patch-Based Methods - PatchMatch and Criminisi Algorithm

Patch-based inpainting searches for similar patches (small rectangular regions) from known areas within the image and copies them to damaged regions. It excels at texture reproduction and generates high-quality results for moderate damage (5-15% of image).

Criminisi algorithm (2004): Repairs from damaged region boundaries in priority order. Boundary pixels where strong edges (structure) enter the region receive high priority, so structural continuity is restored before textures are filled in.

Procedure: (1) Compute boundary pixel priorities (confidence × data term). (2) Set patch (approximately 9x9) centered on highest-priority pixel. (3) Search for most similar patch in known regions using SSD. (4) Copy corresponding portion of found patch to damaged region. (5) Update boundary and repeat.
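The matching step (3) can be made concrete with a minimal NumPy sketch. This is the exhaustive SSD search only, not the full priority-driven algorithm, and the striped toy texture is invented for illustration:

```python
import numpy as np

def find_best_patch(image, known, target_patch, target_known, size=9):
    """Exhaustive SSD search for the fully-known source patch that best
    matches the known pixels of the target patch (Criminisi's step 3)."""
    h, w = image.shape
    best_ssd, best_pos = np.inf, None
    for y in range(h - size + 1):
        for x in range(w - size + 1):
            if not known[y:y + size, x:x + size].all():
                continue  # source patches must contain no damaged pixels
            # Compare only where the target patch is known (mask the rest).
            diff = (image[y:y + size, x:x + size] - target_patch) * target_known
            ssd = float((diff ** 2).sum())
            if ssd < best_ssd:
                best_ssd, best_pos = ssd, (y, x)
    return best_pos, best_ssd

# Toy texture: vertical stripes with a hole punched into it.
tex = np.tile(np.array([0.0, 1.0] * 16), (32, 1))   # 32x32 stripe pattern
known = np.ones((32, 32), dtype=bool)
known[10:19, 14:19] = False                          # damaged region
tex[~known] = 0.0                                    # unknown pixels zeroed

target = tex[10:19, 10:19]          # 9x9 patch straddling the hole
t_known = known[10:19, 10:19]
pos, ssd = find_best_patch(tex, known, target, t_known)
```

Because the stripes repeat, a perfect match (SSD = 0) exists elsewhere in the image; PatchMatch replaces this O(n^2) scan with an approximate search.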

PatchMatch (2009): Fast approximate nearest-neighbor patch search algorithm by Barnes et al. Iterates three steps: random initialization → propagation → random search to find near-optimal nearest neighbors for all patches.

Converges in 5-6 iterations, finding approximate nearest neighbors 100-1000x faster than exhaustive search. Foundation technology behind Adobe Photoshop's "Content-Aware Fill."

Performance: For 1920x1080 image with 10% mask area, approximately 2-5 seconds (CPU). GPU implementation achieves 200-500ms. Particularly excels at repairing repeating texture patterns (grass, walls, sky).

Deep Learning-Based Inpainting - Repair Through Semantic Understanding

Deep learning inpainting leverages semantic knowledge learned from large image datasets, achieving repair of large damaged regions and generation of structurally complex content impossible with conventional methods.

Partial Convolution (2018): Proposed by NVIDIA, uses special convolution operations that ignore masked regions. While standard CNNs are affected by zero values in masked areas, Partial Conv performs normalized convolution using only valid pixels, suppressing mask boundary artifacts.
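The renormalization idea can be sketched in NumPy for a single-channel image and a 3x3 window. This is a simplified illustration of the rule (no learned weights, no bias handling), not the trained NVIDIA layer:

```python
import numpy as np

def partial_conv(x, mask, kernel):
    """Convolve using only valid (mask==1) pixels, rescaling by the
    fraction of valid pixels under the window; mark an output pixel
    valid if its window saw at least one valid input pixel."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x * mask, ((ph, ph), (pw, pw)))  # zero out invalid pixels
    mp = np.pad(mask, ((ph, ph), (pw, pw)))
    out = np.zeros_like(x, dtype=float)
    new_mask = np.zeros_like(mask)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            win = xp[i:i + kh, j:j + kw]
            valid = mp[i:i + kh, j:j + kw].sum()
            if valid > 0:
                # Renormalize: scale by (window size / number of valid pixels).
                out[i, j] = (kernel * win).sum() * (kernel.size / valid)
                new_mask[i, j] = 1
    return out, new_mask

x = np.ones((5, 5))
mask = np.ones((5, 5)); mask[2, 2] = 0   # one missing pixel
k = np.ones((3, 3)) / 9.0                 # mean filter
out, new_mask = partial_conv(x, mask, k)
```

On this constant image the renormalization exactly cancels the missing pixel, so the output stays 1.0 everywhere; a plain convolution would dip near the hole because of the injected zero.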

Gated Convolution (2019): Improvement over Partial Conv with learnable gating mechanisms that dynamically judge each pixel's validity. Generates more natural results for irregular mask shapes.

LaMa (Large Mask Inpainting, 2022): One of the current highest-performing inpainting models. Uses Fast Fourier Convolution (FFC) to efficiently achieve large receptive fields, enabling repair considering entire image structure. Generates natural results even when 50%+ of the image is missing.

Stable Diffusion Inpainting: Diffusion model-based inpainting allowing text prompts to direct repair content. Semantic instructions like "fill this region with blue sky" or "remove person and restore background" are possible, suitable for creative applications.

Performance comparison: deep inpainting models are commonly compared by FID score (lower is better), where LaMa ranks among the best on large-mask benchmarks.

Practical Application Patterns - From Object Removal to Data Augmentation

Practical application patterns for inpainting technology are presented with appropriate method selection guidelines.

Unwanted object removal from photos: Removing pedestrians from tourist photos, power lines, trash cans, and cones. Use segmentation models (SAM: Segment Anything Model) for automatic mask generation. Small objects work with Telea, but large objects (10%+ of image) require LaMa or Stable Diffusion.

Watermark removal: Semi-transparent watermarks cannot be handled with simple masks. Blind Watermark Removal techniques estimating watermark alpha values and back-calculating original images are being researched. However, ethical considerations regarding copyright protection are necessary.

Old photo restoration: Repairing scratches, creases, and fading in scanned old photos. Automatable via scratch detection (edge detection + morphological processing) → mask generation → inpainting pipeline. Fading is addressed with histogram correction; inpainting applies only to physical damage.

Machine learning data augmentation: Cutout + Inpainting randomly masks image portions and restores them to increase training data diversity. Models become robust to occlusion, improving generalization performance.
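The masking half of that augmentation is trivial to generate. A minimal sketch (the square-mask shape and sizes are illustrative; the masked region would then be handed to an inpainting model):

```python
import numpy as np

def random_cutout_mask(shape, size, rng):
    """Random square mask for cutout-style augmentation."""
    h, w = shape
    y = rng.integers(0, h - size)   # top-left corner, kept inside the image
    x = rng.integers(0, w - size)
    mask = np.zeros(shape, dtype=np.uint8)
    mask[y:y + size, x:x + size] = 255
    return mask

rng = np.random.default_rng(42)
mask = random_cutout_mask((64, 64), 16, rng)
```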

Video inpainting: Temporal consistency is crucial for object removal from video. Independent per-frame inpainting causes flickering in repair results. Methods ensuring temporal consistency via optical flow (Video Inpainting) are being researched, with STTN (Spatial-Temporal Transformer Network) showing high-quality results.

Automated Mask Generation and Quality Evaluation

Inpainting quality heavily depends on mask accuracy. Appropriate mask generation methods and quantitative quality evaluation metrics for repair results are explained.

Manual mask generation: The most intuitive method of painting repair regions with brush tools. Widely used in Photoshop, GIMP, and web applications. Optimal when precise masks are needed but unsuitable for bulk processing.

Automated mask generation:

- Segmentation models (e.g., SAM) to mask selected objects automatically
- Edge detection + morphological processing to detect scratches and thin defects
- Text detection to mask overlaid captions or watermark regions

Mask dilation: Detected masks often represent tight object boundaries; dilating masks by 3-5 pixels is common practice to improve repair quality. This reduces boundary artifacts.

Quality evaluation metrics:

- PSNR / SSIM: pixel-level and structural similarity against a ground-truth image (only usable when one exists)
- FID: distributional realism of repaired images, the standard metric for deep models
- LPIPS: learned perceptual distance, correlating better with human judgment than PSNR
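When a ground-truth image is available, PSNR can be computed directly. A minimal NumPy sketch (the constant test images are invented for illustration):

```python
import numpy as np

def psnr(reference, restored, max_val=255.0):
    """Peak signal-to-noise ratio in dB (higher is better)."""
    mse = np.mean((reference.astype(float) - restored.astype(float)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

ref = np.full((32, 32), 128, dtype=np.uint8)
bad = ref.copy()
bad[:4, :] = 120            # simulate a small repair error
val = psnr(ref, bad)        # roughly 39 dB
```

PSNR is cheap but blind to perceptual quality; a blurry fill can score higher than a sharp but slightly shifted one, which is why FID and LPIPS are preferred for deep models.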

Subjective evaluation: Human visual assessment remains most important. Even with high numerical metrics, semantically unnatural repairs (generating objects that shouldn't exist) are judged as low quality.
