Medical Image Processing Fundamentals - DICOM, CT, and MRI Data and Techniques
Medical Image Processing Overview - Modalities and Image Characteristics
Medical image processing analyzes images from CT, MRI, ultrasound, X-ray, and PET systems for diagnostic support and treatment planning. It presents unique challenges and requirements distinct from general image processing.
Major modalities:
- CT (Computed Tomography): Cross-sectional imaging of X-ray absorption. Used for bone, lung, abdominal diagnosis. Resolution 0.5-1mm, 16-bit (HU values)
- MRI (Magnetic Resonance Imaging): Images hydrogen nuclear magnetic resonance. Excellent soft tissue contrast. Resolution 0.5-2mm
- Ultrasound (US): Sound wave reflection imaging. Real-time, non-invasive, low cost. Resolution 1-3mm
- PET (Positron Emission Tomography): Images radiotracer distribution. Visualizes metabolic activity for cancer detection
- X-ray: Plain radiography. Chest, fracture diagnosis. 2D projection images
Medical image specifics:
- High bit depth: CT uses 12-16 bit (4096-65536 levels). Standard 8-bit processing loses information
- 3D volume data: CT/MRI consists of hundreds of slice images forming 3D datasets
- Anisotropic voxels: In-plane resolution (e.g., 0.5mm) often differs from slice spacing (1-5mm)
- Standards compliance: DICOM standard data management is mandatory
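Because of anisotropic voxels, volumes are often resampled to isotropic spacing before 3D analysis. The sketch below is a minimal nearest-neighbor resampler in plain NumPy (a hypothetical helper for illustration; production code would use SimpleITK or interpolating resamplers instead):

```python
import numpy as np

def resample_isotropic(vol, spacing, new_spacing=1.0):
    """Nearest-neighbor resampling of an anisotropic volume to
    isotropic voxels. `spacing` is (z, y, x) voxel size in mm.
    Sketch only; real pipelines use interpolating resamplers."""
    spacing = np.asarray(spacing, dtype=float)
    new_shape = np.round(np.array(vol.shape) * spacing / new_spacing).astype(int)
    # For each output index, pick the nearest source index per axis
    idx = [np.minimum((np.arange(n) * new_spacing / s).astype(int),
                      vol.shape[d] - 1)
           for d, (n, s) in enumerate(zip(new_shape, spacing))]
    return vol[np.ix_(*idx)]

vol = np.zeros((10, 100, 100))  # 5mm slices, 0.5mm in-plane
iso = resample_isotropic(vol, spacing=(5.0, 0.5, 0.5))
print(iso.shape)  # (50, 50, 50): 1mm isotropic
```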
Medical image processing demands algorithm reliability and reproducibility since errors can lead to misdiagnosis. FDA or PMDA approval may be required for clinical deployment.
DICOM Standard - The Medical Image Format
DICOM (Digital Imaging and Communications in Medicine) is the international standard for medical image storage, communication, and display. It integrates image data with metadata including patient information, acquisition parameters, and device specifications.
DICOM file structure: DICOM files consist of tag-value pairs, each identified by (group number, element number):
- (0010,0010): Patient Name
- (0020,000D): Study Instance UID - unique study identifier
- (0028,0010): Rows - image row count
- (0028,0011): Columns - image column count
- (0028,1050): Window Center
- (0028,1051): Window Width
- (7FE0,0010): Pixel Data
Python DICOM processing: The pydicom library reads and writes DICOM files: ds = pydicom.dcmread('image.dcm') loads a file, and ds.pixel_array exposes the pixel data as a NumPy array. For CT images, convert raw pixel values to HU (Hounsfield Units): hu = pixel_value × RescaleSlope + RescaleIntercept.
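The rescale step above can be written as a small function. To keep the sketch self-contained it uses synthetic pixel values; with pydicom, the slope and intercept would come from the RescaleSlope (0028,1053) and RescaleIntercept (0028,1052) tags as shown in the comment:

```python
import numpy as np

def to_hounsfield(pixel_array, slope, intercept):
    """Convert raw CT pixel values to Hounsfield Units:
    hu = pixel_value * RescaleSlope + RescaleIntercept."""
    return pixel_array.astype(np.float32) * slope + intercept

# With pydicom this would typically be:
#   ds = pydicom.dcmread('image.dcm')
#   hu = to_hounsfield(ds.pixel_array, float(ds.RescaleSlope),
#                      float(ds.RescaleIntercept))
raw = np.array([[0, 1024], [2048, 3072]], dtype=np.int16)
hu = to_hounsfield(raw, slope=1.0, intercept=-1024.0)
print(hu)  # [[-1024. 0.] [1024. 2048.]]
```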
DICOM hierarchy: Organized as Patient → Study → Series → Instance. A single CT examination generates 200-1000 slice images, each as one DICOM file. PACS (Picture Archiving and Communication System) centrally manages and distributes these across hospital networks.
CT Image Windowing - HU Values and Display Control
CT images express tissue X-ray attenuation in Hounsfield Units (HU). HU values range from -1024 to +3071 (12-bit), but typical displays render only 256 gray levels and human observers distinguish even fewer, so window settings matched to the tissue of interest are required.
Representative HU values:
- Air: -1000 HU
- Lung: -500 HU
- Fat: -100 HU
- Water: 0 HU
- Soft tissue: +40 HU
- Bone (cancellous): +300 HU
- Bone (cortical): +1000 HU
Window setting examples:
- Lung window: WL = -600, WW = 1500. Observe fine lung structures
- Mediastinal window: WL = 40, WW = 400. Enhance soft tissue contrast
- Bone window: WL = 300, WW = 1500. Observe fractures and bone lesions
- Brain window: WL = 40, WW = 80. Detect subtle brain density differences
Windowing implementation:
display_value = (hu_value - (WL - WW/2)) / WW × 255
Values below WL-WW/2 clip to 0 (black), above WL+WW/2 clip to 255 (white). In Python: np.clip((hu - (wl - ww/2)) / ww * 255, 0, 255).astype(np.uint8). The same CT data visualizes different structures (lung, bone, soft tissue) simply by changing window settings.
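The windowing formula above translates directly into a small NumPy function; the values below reproduce the mediastinal and lung window settings from the list:

```python
import numpy as np

def apply_window(hu, wl, ww):
    """Map HU values to 8-bit display values for window level wl
    and window width ww. Values below wl - ww/2 clip to 0,
    values above wl + ww/2 clip to 255."""
    lo = wl - ww / 2
    disp = (hu - lo) / ww * 255
    return np.clip(disp, 0, 255).astype(np.uint8)

hu = np.array([-1000.0, -600.0, 40.0, 1000.0])
print(apply_window(hu, wl=-600, ww=1500))  # lung window
print(apply_window(hu, wl=40, ww=400))     # mediastinal window
```

The same `hu` array yields very different display values under each window, which is exactly how one CT dataset visualizes lung, bone, and soft tissue.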
Medical Image Segmentation - Automatic Organ and Lesion Extraction
Medical image segmentation automatically extracts specific organs or lesion regions from images. It is a critical process underlying treatment planning, quantitative evaluation, and surgical navigation.
Traditional methods:
- Thresholding: Separate tissues by HU range. Effective for bone (>200HU) and air (<-500HU)
- Region growing: Expand from seed points connecting similar pixels. Used for liver and kidney extraction
- Level Set: Evolve contours via curve evolution equations. Handles complex shapes
- Atlas-based: Non-rigid registration with standard brain atlases for brain region parcellation
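Of the traditional methods above, HU thresholding is the simplest to demonstrate. The sketch below separates bone and air from a toy HU array using the cutoffs mentioned in the list (a minimal illustration; clinical pipelines add connected-component filtering and morphology):

```python
import numpy as np

def threshold_mask(hu, lo=None, hi=None):
    """Binary mask of voxels whose HU value falls in [lo, hi]."""
    mask = np.ones_like(hu, dtype=bool)
    if lo is not None:
        mask &= hu >= lo
    if hi is not None:
        mask &= hu <= hi
    return mask

volume = np.array([[-1000, -600, 40],
                   [300, 1000, -100]], dtype=np.float32)
bone = threshold_mask(volume, lo=200)   # bone: HU > 200
air = threshold_mask(volume, hi=-500)   # air/lung: HU < -500
print(bone.sum(), air.sum())  # 2 2
```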
Deep learning methods:
- U-Net (2015): Standard architecture for medical segmentation. Encoder-decoder with skip connections achieves high accuracy with limited data. Dice scores above 0.9 on many tasks
- nnU-Net (2021): Self-configuring framework that automatically optimizes network structure, preprocessing, and training parameters per dataset. Achieves state-of-the-art on 23 medical segmentation tasks
- MONAI: PyTorch-based medical imaging AI framework providing 3D U-Net, Swin UNETR, and other modern architectures
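The Dice score cited for U-Net above is the standard overlap metric for segmentation evaluation. A minimal NumPy implementation:

```python
import numpy as np

def dice(pred, gt, eps=1e-7):
    """Dice similarity coefficient between two binary masks:
    2|A ∩ B| / (|A| + |B|). 1.0 is perfect overlap."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    return 2.0 * inter / (pred.sum() + gt.sum() + eps)

a = np.array([[1, 1, 0], [0, 1, 0]])
b = np.array([[1, 0, 0], [0, 1, 1]])
print(dice(a, b))  # 2*2/(3+3) ≈ 0.667
```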
3D segmentation challenges: CT/MRI are 3D volumes, and processing them slice by slice in 2D cannot guarantee inter-slice continuity. 3D U-Net preserves spatial continuity with 3D convolutions but faces GPU memory constraints (approximately 16GB for a 512×512×512 volume). Patch-based training with sliding-window inference addresses this limitation.
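The sliding-window idea can be sketched as follows: run the network on overlapping patches and average the overlapping predictions back into a full-size output. Here a lambda stands in for the network's forward pass (MONAI provides a production version of this pattern; this is a simplified NumPy illustration):

```python
import numpy as np

def sliding_window_predict(vol, predict, patch=4, stride=2):
    """Apply `predict` to overlapping cubic patches of a 3D volume
    and average overlapping outputs. `predict` stands in for a 3D
    network forward pass; patch/stride are toy-sized here."""
    out = np.zeros(vol.shape, dtype=np.float32)
    cnt = np.zeros(vol.shape, dtype=np.float32)

    def starts(n):
        return range(0, max(n - patch, 0) + 1, stride)

    for z in starts(vol.shape[0]):
        for y in starts(vol.shape[1]):
            for x in starts(vol.shape[2]):
                sl = (slice(z, z + patch),
                      slice(y, y + patch),
                      slice(x, x + patch))
                out[sl] += predict(vol[sl])
                cnt[sl] += 1
    return out / np.maximum(cnt, 1)

vol = np.random.rand(8, 8, 8).astype(np.float32)
seg = sliding_window_predict(vol, predict=lambda p: (p > 0.5).astype(np.float32))
```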
MRI Image Characteristics and Processing Techniques
MRI uses strong magnetic fields and radiofrequency pulses to detect hydrogen nuclear signals, producing images with excellent soft tissue contrast. Unlike CT, it involves no radiation exposure and offers diverse contrast by varying acquisition parameters.
MRI contrasts:
- T1-weighted: Fat appears bright (white), water dark (black). Used for anatomical structure observation
- T2-weighted: Water appears bright (white), fat intermediate. Effective for detecting edema and inflammation
- FLAIR: T2-weighted with cerebrospinal fluid suppression. Optimal for brain lesion detection
- Diffusion-weighted (DWI): Images water molecule diffusion. Essential for early acute stroke detection
MRI-specific preprocessing:
- Bias field correction: Corrects intensity inhomogeneity from RF coil non-uniformity. The N4ITK algorithm is standard, implemented as SimpleITK's N4BiasFieldCorrectionImageFilter
- Skull stripping: Removes skull and scalp from brain images, extracting brain parenchyma only. BET (FSL) and SynthStrip (FreeSurfer) are representative tools
- Standard space registration: Non-rigid registration to MNI or Talairach space. ANTs SyN algorithm provides high accuracy
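To illustrate what bias field correction does, the toy sketch below fits a first-order polynomial intensity trend to an image and divides it out. This is NOT the N4 algorithm, only a crude stand-in to show the principle; real pipelines use SimpleITK's N4BiasFieldCorrectionImageFilter:

```python
import numpy as np

def crude_bias_correct(img, eps=1e-6):
    """Toy bias correction: fit a linear intensity trend by least
    squares, divide it out, and rescale to preserve mean intensity.
    Illustrative only; use N4BiasFieldCorrectionImageFilter in practice."""
    h, w = img.shape
    yy, xx = np.mgrid[0:h, 0:w]
    A = np.stack([np.ones(img.size), yy.ravel(), xx.ravel()], axis=1)
    coef, *_ = np.linalg.lstsq(A, img.ravel(), rcond=None)
    field = (A @ coef).reshape(img.shape)
    return img / np.maximum(field, eps) * field.mean()

# Uniform tissue (intensity 100) under a simulated linear coil bias:
yy, xx = np.mgrid[0:32, 0:32]
biased = 100.0 * (1.0 + 0.02 * xx)
corrected = crude_bias_correct(biased)
print(corrected.std())  # ~0: the intensity trend has been flattened
```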
Quantitative MRI: Recent advances in T1 mapping and T2 mapping quantitatively measure tissue physical parameters. Unlike conventional qualitative images, these maps characterize tissue numerically, supporting early disease detection and quantification of treatment response.
Medical Image AI in Practice - Development to Clinical Deployment
This section covers concrete tools, datasets, regulatory compliance, and quality management for developing and deploying medical image AI, showing the pipeline from research to clinical implementation.
Development tools and frameworks:
- MONAI: Developed by NVIDIA and King's College London. PyTorch extension specialized for medical imaging with 3D data loaders, medical transforms, and loss functions
- SimpleITK: Python wrapper for ITK. Provides registration, filtering, and segmentation fundamentals
- 3D Slicer: Open-source medical image viewer and analysis platform. Extensible via plugins
- FreeSurfer: Specialized for brain MRI analysis. Standard tool for cortical thickness and brain parcellation
Public datasets:
- MICCAI Challenges: Annual medical image analysis competitions providing diverse task datasets
- Medical Segmentation Decathlon: 10 organ segmentation tasks. nnU-Net benchmark
- TCIA (The Cancer Imaging Archive): Large-scale cancer imaging archive with tens of thousands of freely available CT/MRI cases
Regulation and quality management: Clinical use of medical image AI requires PMDA Class II medical device approval in Japan, or FDA 510(k)/De Novo submission in the US. Development must comply with IEC 62304 (medical device software lifecycle). Addressing training data bias (race, age, device variation) is critical, with multi-site validation recommended for robust clinical performance.