Segmentation
The process of partitioning an image into meaningful regions (objects, background, parts) by assigning labels to each pixel, a core technique in image analysis.
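Each pixel receives a label (a class, an object instance, or both), so the output is typically a dense label map with the same height and width as the input image. Unlike object detection, which localizes objects with coarse bounding boxes, segmentation delineates region boundaries at pixel resolution, which supports detailed scene understanding.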
Segmentation categories:
- Semantic segmentation: Assigns class labels (person, car, road) to each pixel without distinguishing individual instances
- Instance segmentation: Identifies individual objects separately, distinguishing person A from person B
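- Panoptic segmentation: Unifies both by giving every pixel a class label and, for countable foreground objects, an instance ID as well, so background classes and foreground instances are handled in a single output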
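The difference between a semantic map and an instance map can be illustrated on a toy grid. The sketch below is only an illustration under stated assumptions: the class IDs and the helper name `label_instances` are hypothetical, and instances are derived by 4-connected component labeling, which works only when objects of the same class do not touch. Real instance segmentation models also separate touching or overlapping objects.

```python
from collections import deque

# Toy semantic map: 0 = background, 1 = "person" (class IDs are illustrative).
semantic = [
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 1],
    [0, 0, 0, 0, 1],
]

def label_instances(semantic_map, class_id):
    """Give each 4-connected blob of `class_id` its own instance ID (1, 2, ...)."""
    h, w = len(semantic_map), len(semantic_map[0])
    instances = [[0] * w for _ in range(h)]
    next_id = 0
    for r in range(h):
        for c in range(w):
            # Skip pixels of other classes and pixels that are already labeled.
            if semantic_map[r][c] != class_id or instances[r][c]:
                continue
            next_id += 1
            instances[r][c] = next_id
            queue = deque([(r, c)])
            # Breadth-first flood fill over the 4-neighborhood.
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and semantic_map[ny][nx] == class_id
                            and not instances[ny][nx]):
                        instances[ny][nx] = next_id
                        queue.append((ny, nx))
    return instances, next_id

instance_map, num_instances = label_instances(semantic, class_id=1)
```

In the semantic map both people share the label 1. In `instance_map` the left blob gets ID 1 and the right blob gets ID 2, which is the "person A versus person B" distinction. A panoptic output would carry both pieces of information for every pixel.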
Classical versus deep learning approaches:
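- Classical: Thresholding, region growing, watershed, graph cuts. Computationally cheap and usable without training data, but dependent on hand-chosen cues such as intensity or edges, so they struggle with complex scenes
- Deep learning: FCN, U-Net, DeepLab, Mask R-CNN. Learn features from annotated images and are considerably more accurate on complex scenes across diverse domains, at the cost of labeled training data and more computation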
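As an example of the classical family, thresholding can be sketched in a few lines. The version below picks the threshold with Otsu's method (maximizing between-class variance), which is one common choice and not the only one. It assumes 8-bit grayscale values, and the function name and toy image are illustrative. Pure Python is used to keep the sketch dependency-free; in practice an array library would normally be used.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t maximizing between-class variance.

    Pixels with value <= t form the background class, pixels > t the foreground.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * n for i, n in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0  # pixel count and intensity sum of the background class
    for t in range(levels):
        w_bg += hist[t]
        if w_bg == 0:
            continue  # no background pixels yet
        w_fg = total - w_bg
        if w_fg == 0:
            break  # no foreground pixels left
        sum_bg += t * hist[t]
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance, up to a constant factor of 1 / total**2.
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

# Toy 3x3 grayscale image: dark background, bright object on the right.
image = [
    [10, 12, 200],
    [11, 205, 210],
    [9, 13, 198],
]
threshold = otsu_threshold([p for row in image for p in row])
mask = [[1 if p > threshold else 0 for p in row] for row in image]
```

The resulting `mask` marks the bright pixels as foreground. This works here because the two intensity groups are well separated, and it illustrates the limitation noted above: a single global threshold fails when object and background intensities overlap.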
Applications include autonomous driving (road and pedestrian delineation), medical imaging (organ and tumor extraction), remote sensing (land-use classification), and video editing (background removal). Standard metrics are IoU (Intersection over Union) and the Dice coefficient.
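Both metrics compare a predicted mask A with a ground-truth mask B: IoU = |A ∩ B| / |A ∪ B| and Dice = 2|A ∩ B| / (|A| + |B|). A minimal sketch for binary masks follows. The function name is illustrative, and returning 1.0 when both masks are empty is a convention assumed here, not a universal rule.

```python
def iou_and_dice(pred, target):
    """Compute IoU and Dice for two binary masks given as nested lists."""
    p = [bool(v) for row in pred for v in row]
    t = [bool(v) for row in target for v in row]
    if len(p) != len(t):
        raise ValueError("masks must have the same number of pixels")

    intersection = sum(a and b for a, b in zip(p, t))
    pred_area, target_area = sum(p), sum(t)
    union = pred_area + target_area - intersection

    if union == 0:
        # Both masks are empty; treated as a perfect match (assumed convention).
        return 1.0, 1.0
    iou = intersection / union
    dice = 2 * intersection / (pred_area + target_area)
    return iou, dice

pred = [
    [1, 1, 0],
    [0, 1, 0],
]
target = [
    [1, 0, 0],
    [0, 1, 1],
]
iou, dice = iou_and_dice(pred, target)
```

Here the masks share 2 pixels, each covers 3, and their union covers 4, so IoU is 2/4 = 0.5 and Dice is 4/6, about 0.67. The two metrics are monotonically related by Dice = 2·IoU / (1 + IoU), so they rank a pair of masks the same way, although Dice is never lower than IoU. For multi-class problems the scores are commonly computed per class and then averaged.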