Data Augmentation
A technique that artificially increases training data diversity by applying transformations such as rotation, flipping, and color jittering, improving model generalization and reducing overfitting.
Data augmentation applies geometric, photometric, and other transformations to existing training samples, artificially expanding the effective dataset size. Since deep learning models require large amounts of labeled data and annotation is expensive, augmentation is a standard strategy to reduce overfitting and improve generalization.
In image recognition, augmentation is standard practice: competitive models almost always train with it, and omitting it typically costs noticeable accuracy, especially on smaller datasets.
- Geometric transforms: Horizontal flipping, random cropping, rotation, scaling, and affine transforms increase spatial diversity and teach the model invariance to position and orientation (combined with photometric transforms in the pipeline sketch after this list)
- Photometric transforms: Random adjustments to brightness, contrast, saturation, and hue improve robustness to varying lighting conditions and camera characteristics
- Advanced methods: Mixup (linear interpolation of two images and their labels), CutMix (patch-based mixing), and RandAugment (applying randomly chosen transforms at a shared magnitude, avoiding an expensive policy search) provide regularization beyond simple transforms; see the Mixup sketch below
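The geometric and photometric transforms above are typically composed into a single training-time pipeline. The sketch below uses torchvision; the specific transforms and magnitudes are illustrative choices, not a tuned recipe.

```python
# A minimal sketch of a training-time augmentation pipeline using torchvision.
# Transform choices and magnitudes here are illustrative, not a recommendation.
import torchvision.transforms as T

train_transform = T.Compose([
    # Geometric: crop a random region and resize, then flip horizontally
    T.RandomResizedCrop(224, scale=(0.8, 1.0)),
    T.RandomHorizontalFlip(p=0.5),
    # Photometric: jitter brightness, contrast, saturation, and hue
    T.ColorJitter(brightness=0.4, contrast=0.4, saturation=0.4, hue=0.1),
    T.ToTensor(),
])

# Evaluation uses deterministic preprocessing only, so metrics stay reproducible
eval_transform = T.Compose([
    T.Resize(256),
    T.CenterCrop(224),
    T.ToTensor(),
])
```

Augmentation is applied only to training samples; the evaluation pipeline stays deterministic so that reported accuracy is comparable across runs.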
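Mixup itself is simple to implement at the batch level. The sketch below assumes PyTorch tensors with one-hot (or soft) labels; the `alpha` default and the label handling are illustrative assumptions.

```python
# A minimal sketch of Mixup on a training batch (assumes PyTorch tensors
# and one-hot or soft labels). alpha=0.2 is an illustrative default.
import torch

def mixup(images, labels_onehot, alpha=0.2):
    """Linearly interpolate a batch with a shuffled copy of itself."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(images.size(0))
    mixed_images = lam * images + (1.0 - lam) * images[perm]
    mixed_labels = lam * labels_onehot + (1.0 - lam) * labels_onehot[perm]
    return mixed_images, mixed_labels
```

Because the labels are mixed along with the images, the model is trained against soft targets, which is where much of Mixup's regularization effect comes from.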
AutoAugment automatically searches for augmentation policies, while later methods such as TrivialAugment reach similar accuracy by simply applying one randomly chosen transform per image, with no search at all. Generative models are increasingly used to synthesize additional training data for rare classes. Test-Time Augmentation (TTA), which applies augmentations at inference and averages the resulting predictions, can further boost accuracy without retraining, at the cost of extra inference compute.
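TTA can be wrapped around any trained classifier. The sketch below assumes a PyTorch model that returns class logits and uses only a horizontal flip as the extra view; larger augmentation sets (multiple crops or scales) are common but multiply inference time accordingly.

```python
# A minimal sketch of Test-Time Augmentation (assumes a trained PyTorch
# classifier that outputs logits). The flip-only view set is illustrative.
import torch

@torch.no_grad()
def predict_with_tta(model, images):
    """Average softmax predictions over the original and flipped views."""
    views = [images, torch.flip(images, dims=[-1])]  # original + horizontal flip
    probs = [torch.softmax(model(v), dim=1) for v in views]
    return torch.stack(probs).mean(dim=0)
```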