COMP9517_24T2W1_Image_Processing_Basics.pdf
UNSW Sydney
2024
COMP9517 Computer Vision
2024 Term 2 Week 1
Professor Erik Meijering

Image Processing Basics

What is image processing?
– Image processing = image in > image out
  – Aims to suppress distortions and enhance relevant information
  – Used to prepare images for further analysis and interpretation
– Image analysis = image in > features out
– Computer vision = image in > interpretation out

Copyright (C) UNSW COMP9517 24T2W1 Image Processing Basics

Types of image processing
Two main types of image processing operations:
– Spatial domain operations (in image space)
– Transform domain operations (mainly in Fourier space) [slide annotation: next week]
Two main types of spatial domain operations:
– Point operations (intensity transformations on individual pixels) [slide annotation: today]
– Neighbourhood operations (spatial filtering on groups of pixels)

Topics and learning goals
– Describe the workings of basic point operations
  Contrast stretching, thresholding, inversion, log/power transformations
– Understand and use the intensity histogram
  Histogram specification, equalization, matching
– Define arithmetic and logical operations
  Summation, subtraction, AND/OR, averaging

Spatial domain operations
General form of spatial domain operations:
$g(x, y) = T[f(x, y)]$
where
– $f(x, y)$ is the input image
– $g(x, y)$ is the processed image
– $T$ is the operator applied at $(x, y)$

Point operations: $T$ operates on individual pixels
$T: \mathbb{R} \to \mathbb{R}$, $g(x, y) = T[f(x, y)]$
Neighbourhood operations: $T$ operates on multiple pixels
$T: \mathbb{R}^n \to \mathbb{R}$, $g(x, y) = T[f(x, y), f(x + 1, y), f(x - 1, y), \ldots]$

Point operations
[Figure: input image and output image.]

Contrast stretching
[Figure: input image, output image, and the transfer curve $T$ plotting output intensity against input intensity, with breakpoints $L$ and $H$.]
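The transfer curve $T$ for contrast stretching (black below $L$, a linear ramp between $L$ and $H$, white above $H$) can be sketched in a few lines. This is only an illustration, assuming 8-bit gray levels stored as nested Python lists; the function name `contrast_stretch` and the parameters `low` and `high` (standing in for $L$ and $H$) are hypothetical and not taken from the slides.

```python
def contrast_stretch(image, low, high, out_min=0, out_max=255):
    """Point operation: clamp values outside [low, high], rescale the rest linearly."""
    if high <= low:
        raise ValueError("high must be greater than low")
    result = []
    for row in image:
        new_row = []
        for value in row:
            if value < low:
                new_row.append(out_min)      # below L: minimum (black)
            elif value > high:
                new_row.append(out_max)      # above H: maximum (white)
            else:
                # Linear map of [low, high] onto [out_min, out_max]
                scaled = out_min + (value - low) * (out_max - out_min) / (high - low)
                new_row.append(round(scaled))
        result.append(new_row)
    return result


# Tiny 2x3 example image (assumed 8-bit values)
image = [[10, 50, 100], [150, 200, 250]]
stretched = contrast_stretch(image, low=50, high=200)
```

Each output pixel depends only on the input pixel at the same position, which is what makes this a point operation. In practice an array library would normally replace the nested loops.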

Contrast stretching
– Produces images of higher contrast
– Maps input values below $L$ to the minimum (black) in the output
– Maps input values above $H$ to the maximum (white) in the output
– Linearly scales input values between $L$ and $H$ (inclusive) to the range between the minimum (black) and the maximum (white) in the output

Intensity thresholding
[Figure: input image, output image, and the transfer curve $T$ plotting output intensity against input intensity, with the threshold level marked. Three further slides repeat this figure layout.]
– Limiting case of contrast stretching
– Produces binary images from gray-scale images
– Maps values below the threshold to black in the output
– Maps values equal to or above the threshold to white in the output
– Popular method for image segmentation (discussed later)
– Useful only if object and background intensities are very different
– Result depends strongly on the threshold level (user parameter)

Automatic intensity thresholding
Otsu’s method for computing the threshold automatically
https://doi.org/10.1109/TSMC.1979.4310076
– Exhaustively searches for the threshold minimising the intra-class variance
  $\sigma_W^2 = p_0 \sigma_0^2 + p_1 \sigma_1^2$
– Equivalent to maximising the inter-class variance (much faster to compute)
  $\sigma_B^2 = p_0 p_1 (\mu_0 - \mu_1)^2$
Here, $p_0$ is the fraction of pixels below the threshold (class 0), $p_1$ is the fraction of pixels equal to or above the threshold (class 1), $\mu_0$
and $\mu_1$ are the mean intensities of pixels in class 0 and class 1, $\sigma_0^2$ and $\sigma_1^2$ are the intensity variances of the two classes, $p_0 + p_1 = 1$, and $\sigma_W^2 + \sigma_B^2 = \sigma^2$, where $\sigma^2$ is the total intensity variance of the image. Because $\sigma^2$ does not depend on the threshold, minimising $\sigma_W^2$ and maximising $\sigma_B^2$ select the same threshold.

Otsu thresholding example
[Figure: example result of Otsu thresholding.]

Automatic intensity thresholding
IsoData method for computing the threshold automatically
1. Select an arbitrary initial threshold $t$
2. Compute $\mu_0$ and $\mu_1$ with respect to the threshold
3. Update the threshold to the mean of the means: $t = (\mu_0 + \mu_1)/2$
4. If the threshold changed in Step 3, go to Step 2
Upon convergence, the threshold is midway between the two class means

IsoData thresholding example
[Figure: example result of IsoData thresholding.]

Multilevel thresholding
[Figure: input image, output image, and the transfer curve $T$ plotting output intensity against input intensity, with Threshold #1 and Threshold #2 marked.]

Intensity inversion
[Figure: input image, output image, and the transfer curve $T$ plotting output intensity against input intensity.]

Intensity inversion examples
“Assessment of grayscale inverted images in addition to standard images facilitates the detection of microcalcification.”
https://doi.org/10.1186/s12880-017-0196-6

Log transformation
Definition of log transformation:
$s = c \log(1 + r)$
where $r$ is the input intensity, $s$ is the output intensity, and $c$ is a constant
– Maps a narrow input range of low gray-level values into a wider range of output values, and vice versa for higher gray-level values
– Also compresses the dynamic range of images with large variations in pixel values (such as Fourier spectra, to be discussed later)
[Figure: plot of output gray level $s$ against input gray level $r$.]

Power transformation
Definition of power transformation:
$s = c\, r^{\gamma}$
where $c$ and $\gamma$ are
constants
– Similar to (inverse) log transformation
– Represents a family of transformations by varying $\gamma$
– Many devices respond according to a power law
– Example power transformation: gamma correction
– Useful for general-purpose contrast manipulation
[Figure: plot of output gray level $s$ against input gray level $r$.]

Power transformation examples
[Figure: input image and results for $c = 1$ with $\gamma = 3$, $\gamma = 4$, and $\gamma = 5$.]

Piecewise linear transformations
– Complementary to other transformation methods
– Enable more fine-tuned design of transformations
– Can have very complex shapes
– Require more user input
[Figure: plot of output gray level $s$ against input gray level $r$.]

Piecewise contrast stretching
– One of the simplest piecewise linear transformations
– Increases the dynamic range of gray levels in images
– Used in display devices or recording media to span the full range
[Figure: input, transformed, and thresholded images, with a plot of output gray level $s$ against input gray level $r$.]

Gray-level slicing
Used to highlight a specific range of gray levels
Two different slicing approaches:
1) High value for all gray levels in a range of interest and low value for all others (produces a binary image)
2) Brighten a desired range of gray levels while preserving background and other gray-scale tones of the image
[Figure: transfer curves Transform 1 and Transform 2, the input image, and the result of approach 1.]

Bit-plane slicing
– Highlights the contribution to the total image by specific bits
– An image with $n$ bits/pixel has $n$ bit-planes
– Can be useful for image compression
Example: the pixel value 151 is 10010111 in binary, read from bit-plane 7 (most significant) down to bit-plane 0 (least significant)
[Figure: input image and its bit-planes.]

Histogram of pixel values
For every possible gray-level value, count the number of pixels having that value, and plot the pixel counts as a
function of gray level
[Figure: histogram of an 8-bit image, count $h(r)$ against gray level, with $L = 2^8 = 256$ levels.]
With $N$ the number of pixels:
$\sum_{r=0}^{L-1} h(r) = N$
Normalized histogram = probability function:
$p(r) = \frac{1}{N} h(r)$

Histogram based thresholding
Triangle method for computing the threshold automatically
1. Find the histogram peak $(r_p, h_p)$ and the highest gray level point $(r_m, h_m)$
2. Construct a straight line $l(r)$ from the peak to the highest gray level point
3. Find the gray level $r$ for which the distance $l(r) - h(r)$ is the largest (marked $d(r)$ in the figure)
[Figure: histogram $h(r)$ over gray levels 0 to 255, with the line $l(r)$ from $(r_p, h_p)$ to $(r_m, h_m)$ and the distance $d(r)$.]

Comparison of thresholding methods
[Figure: image, histogram, and the results of the Otsu, IsoData, and Triangle methods.]

Histogram processing
Histogram equalization
– Aim: To get an image with equally distributed intensity levels over the full intensity range
Histogram specification (also called histogram matching)
– Aim: To get an image with a specified intensity distribution, determined by the shape of the histogram

Histogram equalization
Enhances contrast for intensity values near histogram maxima and decreases contrast near histogram minima
[Figure annotation: Histogram bins are much more “equal” here.]

Histogram equalization
Let $r \in [0, L - 1]$ represent pixel values (intensities, gray levels)
$r = 0$ represents black and $r = L - 1$ represents white
Consider transformations $s = T(r)$, $0 \le r \le L - 1$, satisfying
1) $T(r)$ is single-valued and monotonically increasing in $0 \le r \le L - 1$
   This guarantees that the inverse transformation $T^{-1}(s)$ exists
2) $0 \le T(r) \le L - 1$ for $0 \le r \le L - 1$
   This guarantees that the input and output ranges will be the same

Histogram equalization (continuous case)
Consider $r$ and $s$ as continuous
random variables over $[0, L - 1]$ with PDFs $p_r(r)$ and $p_s(s)$
If $p_r(r)$ and $T(r)$ are known and $T^{-1}(s)$ satisfies monotonicity, then, from probability theory:
$p_s(s) = p_r(r) \frac{dr}{ds}$
https://www.cl.cam.ac.uk/teaching/2003/Probability/prob11.pdf
Let us choose:
$s = T(r) = (L - 1) \int_0^r p_r(\xi)\, d\xi$
This is the CDF (cumulative distribution function) of $r$, scaled by $L - 1$, which satisfies conditions (1) and (2)
Now:
$\frac{ds}{dr} = \frac{dT(r)}{dr} = (L - 1) \frac{d}{dr} \int_0^r p_r(\xi)\, d\xi = (L - 1)\, p_r(r)$
Therefore:
$p_s(s) = p_r(r) \frac{1}{(L - 1)\, p_r(r)} = \frac{1}{L - 1}$ for $0 \le s \le L - 1$ (uniform distribution)

Histogram equalization (discrete case)
For discrete values we get probabilities and summations instead of PDFs and integrals:
$p_r(r_k) = n_k / (MN)$ for $k = 0, 1, \ldots, L - 1$
where $MN$ is the total number of pixels in the image, $n_k$ is the number of pixels with gray level $r_k$, and $L$ is the total number of gray levels in the image
Thus:
$s_k = T(r_k) = (L - 1) \sum_{j=0}^{k} p_r(r_j) = \frac{L - 1}{MN} \sum_{j=0}^{k} n_j$ for $k = 0, 1, \ldots, L - 1$
This transformation is called histogram equalization
However, for discrete images, applying a single mapping function does not give a truly uniform distribution, and adaptive approaches (multiple mapping functions) are needed

Constrained histogram equalization
[Figure: input image with histogram $h(r)$; full histogram equalization (slope of $T(r)$ is unconstrained) and constrained histogram equalization (slope of $T(r)$ is constrained), each shown with $T(r)$ and the resulting $h(r)$.]

Histogram matching (continuous case)
Assume that $r$ and $s$ are continuous and $p_z(z)$ is the target distribution for the output image
From our previous analysis we know the following transformation results in a uniform distribution:
$s = T(r) = (L - 1) \int_0^r p_r(\xi)\, d\xi$
Now we can define a function $G(z)$ as:
$G(z) = (L - 1) \int_0^z p_z(\xi)\, d\xi = s$
Therefore:
$z = G^{-1}(s) = G^{-1}(T(r))$
[Figure: diagram linking $p_r(r)$, $p_s(s)$, and $p_z(z)$ through $s = T(r)$, $r = T^{-1}(s)$, $s = G(z)$, and $z = G^{-1}(s)$.]

Histogram matching (discrete case)
For discrete image values we can write:
$s_k = T(r_k) = (L - 1) \sum_{j=0}^{k} p_r(r_j) = \frac{L - 1}{MN} \sum_{j=0}^{k} n_j$ for $k = 0, 1, \ldots, L - 1$
And:
$G(z_q) = (L - 1) \sum_{i=0}^{q} p_z(z_i)$
Therefore:
$z_q = G^{-1}(s_k)$

Histogram matching example
https://automaticaddison.com/tag/image-processing/page/3/
[Figure: input, target, and output images.]
Matching done for each colour channel (R, G, B)

Arithmetic and logical operations
Defined on a pixel-by-pixel basis between two images
[Figure: Input 1 (operator) Input 2 = Output, with operators +, −, ∗, /, ^, AND, OR, XOR, …]

Arithmetic and logical operations
Useful arithmetic operations include addition and subtraction
[Figure: Input 1 − Input 2 = Output.]
Useful logical operations include bitwise AND and OR
[Figures: Input, Mask, and Input AND Mask; Input, Mask, and Input OR Mask.]

Averaging
Useful for example to reduce noise in images
Assume the true noise-free image is $g(x, y)$ and the actual observed images are $f_i(x, y) = g(x, y) + n_i(x, y)$ for $i = 1, \ldots, N$, where the $n_i$ are zero-mean, independent and identically distributed (i.i.d.)
noise images, then we have
$E[f_i(x, y)] = g(x, y)$ and $\mathrm{VAR}[f_i(x, y)] = \mathrm{VAR}[n_i(x, y)] = \sigma^2(x, y)$
so that
$\bar{f}(x, y) = \frac{1}{N} \sum_{i=1}^{N} f_i(x, y) = \frac{1}{N} \sum_{i=1}^{N} [g(x, y) + n_i(x, y)] = g(x, y) + \frac{1}{N} \sum_{i=1}^{N} n_i(x, y)$
and
$\mathrm{VAR}\left[\frac{1}{N} \sum_{i=1}^{N} n_i(x, y)\right] = \frac{1}{N^2} \sum_{i=1}^{N} \mathrm{VAR}[n_i(x, y)] = \frac{1}{N^2} N \sigma^2(x, y) = \frac{\sigma^2(x, y)}{N}$

Averaging
Useful for example to reduce noise in images
[Figure: averaged images for increasing $N$, with the remaining noise standard deviation.]
Number of images    $N = 1$    $N = 8$         $N = 16$      $N = 64$      $N = 128$
Noise level         $\sigma$   $\sigma/2.8$    $\sigma/4$    $\sigma/8$    $\sigma/11.3$

Further reading on discussed topics
– Sections 3.1-3.3 of Szeliski
– Chapter 3 of Gonzalez and Woods 2002
Acknowledgement
– Some images drawn from the mentioned resources

Example exam question
Which one of the following statements about intensity transformations is incorrect?
A. Contrast stretching linearly maps intensities between two values to the full output range.
B. Log transformation maps a narrow range of high intensities to a wider range of output values.
C. Power transformation can map intensities similar to log and inverse log transformations.
D. Piecewise linear transformations can achieve contrast stretching and intensity slicing.
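
Returning to Otsu's method from the automatic thresholding slides: the exhaustive search over thresholds can be sketched directly from the inter-class variance $\sigma_B^2 = p_0 p_1 (\mu_0 - \mu_1)^2$. This is a minimal sketch, assuming the histogram is given as a plain Python list of counts indexed by gray level; the name `otsu_threshold` and the toy 8-level histogram are hypothetical.

```python
def otsu_threshold(hist):
    """Return the threshold t that maximises the inter-class variance.

    Class 0 holds gray levels below t; class 1 holds levels equal to or above t.
    """
    total = sum(hist)
    total_sum = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    count0 = 0   # number of pixels in class 0
    sum0 = 0     # sum of intensities in class 0
    for t in range(1, len(hist)):
        # Moving the threshold from t-1 to t adds gray level t-1 to class 0
        count0 += hist[t - 1]
        sum0 += (t - 1) * hist[t - 1]
        count1 = total - count0
        if count0 == 0 or count1 == 0:
            continue  # one class is empty, so the class means are undefined
        p0 = count0 / total
        p1 = count1 / total
        mu0 = sum0 / count0
        mu1 = (total_sum - sum0) / count1
        var_between = p0 * p1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


# Toy bimodal histogram with 8 gray levels (assumed data)
hist = [5, 5, 0, 0, 0, 0, 5, 5]
threshold = otsu_threshold(hist)
```

For this toy histogram every threshold in the empty gap between the two modes gives the same inter-class variance, so the sketch simply keeps the first one it finds.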
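
The four IsoData steps listed earlier translate almost line by line into a loop. The sketch below is one possible reading, assuming a histogram given as a list of counts; the function name `isodata_threshold`, the mid-range default for the initial threshold, the convergence tolerance, and the iteration cap are all assumptions added for illustration.

```python
def isodata_threshold(hist, initial=None, max_iter=100, tol=1e-9):
    """Iterate t = (mu0 + mu1) / 2 until the threshold stops changing."""
    # Step 1: arbitrary initial threshold (mid-range here, an assumption)
    t = (len(hist) - 1) / 2 if initial is None else initial
    for _ in range(max_iter):
        n0 = s0 = n1 = s1 = 0
        # Step 2: class statistics with respect to the current threshold
        for level, count in enumerate(hist):
            if level < t:
                n0 += count
                s0 += level * count
            else:
                n1 += count
                s1 += level * count
        if n0 == 0 or n1 == 0:
            break  # one class is empty, so its mean is undefined
        # Step 3: update the threshold to the mean of the class means
        new_t = (s0 / n0 + s1 / n1) / 2
        # Step 4: stop once the threshold no longer changes
        if abs(new_t - t) < tol:
            break
        t = new_t
    return t


hist = [5, 5, 0, 0, 0, 0, 5, 5]          # toy 8-level histogram (assumed data)
t_default = isodata_threshold(hist)
t_from_low = isodata_threshold(hist, initial=1)
```

Starting from either initial value, the loop settles midway between the two class means of this toy histogram (0.5 and 6.5).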
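
The discrete histogram equalization mapping $s_k = \frac{L - 1}{MN} \sum_{j=0}^{k} n_j$ can be illustrated on a tiny image. This is a sketch under stated assumptions: the image is a nested list of integer gray levels, the result is rounded to the nearest level, and the name `equalize` and the 3-bit example are hypothetical.

```python
def equalize(image, levels=256):
    """Map each gray level r_k to s_k = (L - 1) * (cumulative count up to k) / (total pixels)."""
    # Histogram n_k: number of pixels with gray level k
    hist = [0] * levels
    for row in image:
        for value in row:
            hist[value] += 1
    n_pixels = sum(hist)

    # Cumulative sum of the histogram, scaled to [0, L - 1] and rounded
    mapping = []
    cumulative = 0
    for count in hist:
        cumulative += count
        mapping.append(round((levels - 1) * cumulative / n_pixels))

    # Apply the single mapping function to every pixel (a point operation)
    return [[mapping[value] for value in row] for row in image]


# Dark 3-bit image (L = 8) that uses only gray levels 0 to 3
image = [[0, 0, 1, 1], [1, 2, 2, 3]]
equalized = equalize(image, levels=8)
```

The output spreads the four occupied levels over the full range but still has only four distinct values, which illustrates the slide's remark that a single mapping function cannot produce a truly uniform distribution for discrete images.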
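
The averaging result $\sigma^2 / N$ can be checked numerically with synthetic data. The following sketch assumes a constant noise-free image and Gaussian noise generated with Python's `random` module; the helper names `noisy_copy` and `average_images`, the image size, and the seed are illustrative choices, not part of the lecture.

```python
import random
import statistics


def noisy_copy(clean, sigma, rng):
    """Return the clean image plus zero-mean Gaussian noise with std sigma."""
    return [[value + rng.gauss(0.0, sigma) for value in row] for row in clean]


def average_images(images):
    """Pixel-wise mean of equally sized images stored as nested lists."""
    n = len(images)
    rows, cols = len(images[0]), len(images[0][0])
    return [[sum(img[y][x] for img in images) / n for x in range(cols)]
            for y in range(rows)]


rng = random.Random(0)                        # fixed seed for repeatability
clean = [[100.0] * 50 for _ in range(50)]     # constant "true" image g(x, y)
sigma = 10.0
n_images = 16

observed = [noisy_copy(clean, sigma, rng) for _ in range(n_images)]
averaged = average_images(observed)

# Remaining noise = image minus the known true value of 100
std_single = statistics.pstdev(v - 100.0 for row in observed[0] for v in row)
std_averaged = statistics.pstdev(v - 100.0 for row in averaged for v in row)
```

With $N = 16$ the measured noise level of the averaged image should come out near $\sigma / 4$, in line with the table of noise levels, up to sampling error.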