
Image Inpainting Technology and Applications - From Classical Methods to Deep Learning


What is Image Inpainting - Technology for Naturally Filling Missing Regions

Image inpainting naturally restores specified regions (masks) in images using surrounding information. It is an important field with broad applications including unwanted object removal from photos, scratch repair in old photographs, watermark removal, and text overlay removal.

Inpainting inputs and outputs: the algorithm takes an image plus a binary mask marking the pixels to repair, and outputs an image in which the masked pixels are replaced with content estimated from the surrounding known region.

Technical challenges: Inpainting is inherently an ill-posed problem. No "correct answer" exists for damaged regions; the most natural-looking content must be estimated from surrounding context. Small scratch repair (few pixels) is relatively easy, but large region repair (20%+ of image) is extremely difficult, requiring semantic understanding.

Method classification:

- Diffusion-based (Navier-Stokes, Telea): small scratches and thin lines
- Patch-based (Criminisi, PatchMatch): moderate damage in textured regions
- Deep learning-based (Partial/Gated Convolution, LaMa, diffusion models): large regions and structurally complex content

Selecting the appropriate method based on repair region size and content complexity is crucial.

Classical Methods - Navier-Stokes and Telea Algorithms

Two classical inpainting algorithms implemented in OpenCV are explained. Both are suitable for small damage repair (scratches, text, thin lines) with low computational cost and easy implementation.

Navier-Stokes method (cv2.INPAINT_NS): Proposed by Bertalmio et al. (2001), inspired by the Navier-Stokes equations of fluid dynamics. Propagates image isophotes (curves of equal intensity) from the damaged region boundary inward.

Principle: Computes gradient direction and magnitude at damaged region boundaries, iteratively propagating this information inward. Preserving isophote curvature naturally maintains edge continuity.

Telea method (cv2.INPAINT_TELEA): FMM (Fast Marching Method) based approach proposed by Alexandru Telea (2004). Fills values from damaged region boundaries inward using distance-weighted averages.

Principle: Processes pixels sequentially from boundary inward, determining values as weighted averages of known pixels (closer pixels weighted higher). FMM optimizes processing order for fast operation.

OpenCV implementation:

result = cv2.inpaint(img, mask, inpaintRadius=3, flags=cv2.INPAINT_TELEA)

inpaintRadius is the neighborhood radius (pixels) referenced for each pixel's repair. 3-5 is typical; larger values produce smoother results but increase processing time.

Performance comparison: For 1920x1080 image with 1% mask area, Telea takes approximately 50ms, NS approximately 200ms. NS excels at edge preservation while Telea excels at uniform region repair. Both methods' quality degrades rapidly when mask area exceeds 5%, producing blurry results.

Patch-Based Methods - PatchMatch and Criminisi Algorithm

Patch-based inpainting searches for similar patches (small rectangular regions) from known areas within the image and copies them to damaged regions. It excels at texture reproduction and generates high-quality results for moderate damage (5-15% of image).

Criminisi algorithm (2004): Repairs from damaged region boundaries in priority order. Boundary pixels where strong edges (structure) enter the region receive high priority, so structural continuity is restored before textures are filled in.

Procedure: (1) Compute boundary pixel priorities (confidence × data term). (2) Set patch (approximately 9x9) centered on highest-priority pixel. (3) Search for most similar patch in known regions using SSD. (4) Copy corresponding portion of found patch to damaged region. (5) Update boundary and repeat.
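The matching step (3) can be made concrete with a minimal NumPy sketch. This is the exhaustive SSD search only, not the full priority-driven algorithm, and the striped toy texture is invented for illustration:

```python
import numpy as np

def find_best_patch(image, known, target_patch, target_known, size=9):
    """Exhaustive SSD search for the fully-known source patch that best
    matches the known pixels of the target patch (Criminisi's step 3)."""
    h, w = image.shape
    best_ssd, best_pos = np.inf, None
    for y in range(h - size + 1):
        for x in range(w - size + 1):
            if not known[y:y + size, x:x + size].all():
                continue  # source patches must contain no damaged pixels
            # Compare only where the target patch is known (mask the rest).
            diff = (image[y:y + size, x:x + size] - target_patch) * target_known
            ssd = float((diff ** 2).sum())
            if ssd < best_ssd:
                best_ssd, best_pos = ssd, (y, x)
    return best_pos, best_ssd

# Toy texture: vertical stripes with a hole punched into it.
tex = np.tile(np.array([0.0, 1.0] * 16), (32, 1))   # 32x32 stripe pattern
known = np.ones((32, 32), dtype=bool)
known[10:19, 14:19] = False                          # damaged region
tex[~known] = 0.0                                    # unknown pixels zeroed

target = tex[10:19, 10:19]          # 9x9 patch straddling the hole
t_known = known[10:19, 10:19]
pos, ssd = find_best_patch(tex, known, target, t_known)
```

Because the stripes repeat, a perfect match (SSD = 0) exists elsewhere in the image; PatchMatch replaces this O(n^2) scan with an approximate search.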

PatchMatch (2009): Fast approximate nearest-neighbor patch search algorithm by Barnes et al. Iterates three steps: random initialization → propagation → random search to find near-optimal nearest neighbors for all patches.

Converges in 5-6 iterations, finding approximate nearest neighbors 100-1000x faster than exhaustive search. Foundation technology behind Adobe Photoshop's "Content-Aware Fill."

Performance: For 1920x1080 image with 10% mask area, approximately 2-5 seconds (CPU). GPU implementation achieves 200-500ms. Particularly excels at repairing repeating texture patterns (grass, walls, sky).

Deep Learning-Based Inpainting - Repair Through Semantic Understanding

Deep learning inpainting leverages semantic knowledge learned from large image datasets, achieving repair of large damaged regions and generation of structurally complex content impossible with conventional methods.

Partial Convolution (2018): Proposed by NVIDIA, uses special convolution operations that ignore masked regions. While standard CNNs are affected by zero values in masked areas, Partial Conv performs normalized convolution using only valid pixels, suppressing mask boundary artifacts.
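The renormalization idea can be sketched in NumPy for a single-channel image and a 3x3 window. This is a simplified illustration of the rule (no learned weights, no bias handling), not the trained NVIDIA layer:

```python
import numpy as np

def partial_conv(x, mask, kernel):
    """Convolve using only valid (mask==1) pixels, rescaling by the
    fraction of valid pixels under the window; mark an output pixel
    valid if its window saw at least one valid input pixel."""
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x * mask, ((ph, ph), (pw, pw)))  # zero out invalid pixels
    mp = np.pad(mask, ((ph, ph), (pw, pw)))
    out = np.zeros_like(x, dtype=float)
    new_mask = np.zeros_like(mask)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            win = xp[i:i + kh, j:j + kw]
            valid = mp[i:i + kh, j:j + kw].sum()
            if valid > 0:
                # Renormalize: scale by (window size / number of valid pixels).
                out[i, j] = (kernel * win).sum() * (kernel.size / valid)
                new_mask[i, j] = 1
    return out, new_mask

x = np.ones((5, 5))
mask = np.ones((5, 5)); mask[2, 2] = 0   # one missing pixel
k = np.ones((3, 3)) / 9.0                 # mean filter
out, new_mask = partial_conv(x, mask, k)
```

On this constant image the renormalization exactly cancels the missing pixel, so the output stays 1.0 everywhere; a plain convolution would dip near the hole because of the injected zero.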

Gated Convolution (2019): Improvement over Partial Conv with learnable gating mechanisms that dynamically judge each pixel's validity. Generates more natural results for irregular mask shapes.

LaMa (Large Mask Inpainting, 2022): One of the current highest-performing inpainting models. Uses Fast Fourier Convolution (FFC) to efficiently achieve large receptive fields, enabling repair considering entire image structure. Generates natural results even when 50%+ of the image is missing.

Stable Diffusion Inpainting: Diffusion model-based inpainting allowing text prompts to direct repair content. Semantic instructions like "fill this region with blue sky" or "remove person and restore background" are possible, suitable for creative applications.

Performance comparison: deep inpainting models are commonly compared by FID score (lower is better), where LaMa ranks among the best on large-mask benchmarks.

Practical Application Patterns - From Object Removal to Data Augmentation

Practical application patterns for inpainting technology are presented with appropriate method selection guidelines.

Unwanted object removal from photos: Removing pedestrians from tourist photos, power lines, trash cans, and cones. Use segmentation models (SAM: Segment Anything Model) for automatic mask generation. Small objects work with Telea, but large objects (10%+ of image) require LaMa or Stable Diffusion.

Watermark removal: Semi-transparent watermarks cannot be handled with simple masks. Blind Watermark Removal techniques estimating watermark alpha values and back-calculating original images are being researched. However, ethical considerations regarding copyright protection are necessary.

Old photo restoration: Repairing scratches, creases, and fading in scanned old photos. Automatable via scratch detection (edge detection + morphological processing) → mask generation → inpainting pipeline. Fading is addressed with histogram correction; inpainting applies only to physical damage.

Machine learning data augmentation: Cutout + Inpainting randomly masks image portions and restores them to increase training data diversity. Models become robust to occlusion, improving generalization performance.
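The masking half of that augmentation is trivial to generate. A minimal sketch (the square-mask shape and sizes are illustrative; the masked region would then be handed to an inpainting model):

```python
import numpy as np

def random_cutout_mask(shape, size, rng):
    """Random square mask for cutout-style augmentation."""
    h, w = shape
    y = rng.integers(0, h - size)   # top-left corner, kept inside the image
    x = rng.integers(0, w - size)
    mask = np.zeros(shape, dtype=np.uint8)
    mask[y:y + size, x:x + size] = 255
    return mask

rng = np.random.default_rng(42)
mask = random_cutout_mask((64, 64), 16, rng)
```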

Video inpainting: Temporal consistency is crucial for object removal from video. Independent per-frame inpainting causes flickering in repair results. Methods ensuring temporal consistency via optical flow (Video Inpainting) are being researched, with STTN (Spatial-Temporal Transformer Network) showing high-quality results.

Automated Mask Generation and Quality Evaluation

Inpainting quality heavily depends on mask accuracy. Appropriate mask generation methods and quantitative quality evaluation metrics for repair results are explained.

Manual mask generation: The most intuitive method of painting repair regions with brush tools. Widely used in Photoshop, GIMP, and web applications. Optimal when precise masks are needed but unsuitable for bulk processing.

Automated mask generation:

- Segmentation models (e.g., SAM) to mask selected objects automatically
- Edge detection + morphological processing to detect scratches and thin defects
- Text detection to mask overlaid captions or watermark regions

Mask dilation: Detected masks often represent tight object boundaries; dilating masks by 3-5 pixels is common practice to improve repair quality. This reduces boundary artifacts.

Quality evaluation metrics:

- PSNR / SSIM: pixel-level and structural similarity against a ground-truth image (only usable when one exists)
- FID: distributional realism of repaired images, the standard metric for deep models
- LPIPS: learned perceptual distance, correlating better with human judgment than PSNR
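When a ground-truth image is available, PSNR can be computed directly. A minimal NumPy sketch (the constant test images are invented for illustration):

```python
import numpy as np

def psnr(reference, restored, max_val=255.0):
    """Peak signal-to-noise ratio in dB (higher is better)."""
    mse = np.mean((reference.astype(float) - restored.astype(float)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

ref = np.full((32, 32), 128, dtype=np.uint8)
bad = ref.copy()
bad[:4, :] = 120            # simulate a small repair error
val = psnr(ref, bad)        # roughly 39 dB
```

PSNR is cheap but blind to perceptual quality; a blurry fill can score higher than a sharp but slightly shifted one, which is why FID and LPIPS are preferred for deep models.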

Subjective evaluation: Human visual assessment remains most important. Even with high numerical metrics, semantically unnatural repairs (generating objects that shouldn't exist) are judged as low quality.
