COMP9517 Computer Vision 2024 Term 2 Week 2 Image Processing Part 1 PDF
UNSW Sydney
Dr Dong Gong
Summary
These are lecture notes for Week 2 of COMP9517 Computer Vision (UNSW Sydney, 2024 Term 2). They cover neighbourhood operations in the spatial domain: convolution and the border problem, smoothing filters (uniform, Gaussian, median), pooling, derivative filters, and their use for sharpening and edge detection.
Full Transcript
COMP9517 Computer Vision
2024 Term 2 Week 2
Dr Dong Gong
Image Processing Part 1

Types of image processing (recap)
Two main types of image processing operations:
– Spatial domain operations (in image space)
– Transform domain operations (mainly in Fourier space) [Next time]
Two main types of spatial domain operations:
– Point operations (intensity transformations on individual pixels)
– Neighbourhood operations (spatial filtering on groups of pixels) [Today]

Copyright (C) UNSW COMP9517 24T2W2 Image Processing Part 1

Point operations (recap)

Neighbourhood operations

Topics and learning goals
Describe the workings of neighbourhood operations
– Convolution, spatial filtering, linear shift-invariant operations, border problem
Understand the effects of various filtering methods
– Uniform filter, Gaussian filter, median filter, smoothing, differentiation, separability, pooling
Combine filtering operations to perform image enhancement
– Sharpening, unsharp masking, gradient vector & magnitude, edge detection

Spatial filtering on groups of pixels
Use the gray values in a small neighbourhood of a pixel in the input image to produce a new gray value for that pixel in the output image.
Also called filtering techniques because, depending on the weights applied to the pixel values, they can suppress (filter out) or enhance information.
The neighbourhood of $(x, y)$ is usually a square or rectangular subimage centred at $(x, y)$ and is called a filter, mask, kernel, template, or window.
Typical kernel sizes are 3×3, 5×5, and 7×7 pixels, but kernels can be larger and have a different shape (e.g. circular rather than rectangular).

Example: blur / low-pass filtering
– Replaces each pixel with a (weighted) average of its neighbourhood
– Achieves a smoothing effect (removes sharp features)

Spatial filtering by convolution
The output image $o(x, y)$ is computed by discrete convolution of the given input image $f(x, y)$ and kernel $h(x, y)$:
$$o(x, y) = \sum_{i=-n}^{n} \sum_{j=-m}^{m} f(x - i, y - j)\, h(i, j)$$
[Figure: results of mean filter and Gaussian filter. Credit: Steve Seitz]

Fixing the border problem
Expand the image outside the original border using:
– Padding: Set all additional pixels to a constant (zero) value. Hard transitions yield border artifacts (requires windowing).
– Clamping: Repeat all border pixel values indefinitely. Better border behaviour but arbitrary (no theoretical foundation).
– Wrapping: Copy pixel values from opposite sides. Implicitly used in the (fast) Fourier transform.
– Mirroring: Reflect pixel values across borders. Smooth, symmetric, periodic, no boundary artifacts.
[Figure: examples of padding, wrapping, clamping, and mirroring]

Spatial filtering by convolution
Convolution is a linear, shift-invariant operation.
Linearity: If input $f_1(x, y)$ yields output $g_1(x, y)$ and $f_2(x, y)$ yields $g_2(x, y)$, then a linear combination of inputs $a_1 f_1(x, y) + a_2 f_2(x, y)$ yields the same combination of outputs $a_1 g_1(x, y) + a_2 g_2(x, y)$, for any constants $a_1, a_2$.
Shift invariance: If input $f(x, y)$ yields output $g(x, y)$, then the shifted input $f(x - \Delta x, y - \Delta y)$ yields the shifted output $g(x - \Delta x, y - \Delta y)$. In other words, the operation does not discriminate between spatial positions.

Properties of convolution
For any set of images (functions) $f_i$ the convolution operation $*$ satisfies:
Commutativity: $f_1 * f_2 = f_2 * f_1$
Associativity: $f_1 * (f_2 * f_3) = (f_1 * f_2) * f_3$
Distributivity: $f_1 * (f_2 + f_3) = f_1 * f_2 + f_1 * f_3$
Multiplicativity: $a \cdot (f_1 * f_2) = (a \cdot f_1) * f_2 = f_1 * (a \cdot f_2)$
Derivation: $(f_1 * f_2)' = f_1' * f_2 = f_1 * f_2'$
Theorem: $f_1 * f_2 \leftrightarrow \hat{f}_1 \cdot \hat{f}_2$ (convolution in the spatial domain amounts to multiplication in the spectral domain; next time)

Simplest smoothing filter
Calculates the mean pixel value in a neighbourhood $N$ with $|N|$ pixels:
$$g(x, y) = \frac{1}{|N|} \sum_{(i,j) \in N} f(x + i, y + j)$$
– Often used for image blurring and noise reduction
– Reduces fluctuations due to disturbances in image acquisition
– Neighbourhood averaging also blurs the object edges in the image
– Can use weighted averaging to give more importance to some pixels
Also called uniform filter as it implicitly uses a uniform kernel:
$$u_3 = \frac{1}{9}\begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}$$
Likewise $u_5$ is $\frac{1}{25}$ times a 5×5 matrix of ones, $u_7$ is $\frac{1}{49}$ times a 7×7 matrix of ones, and so forth.
[Figure: $f$, $f * u_3$, $f * u_5$, $f * u_7$, and so forth]

Gaussian filter
The Gaussian filter is one of the most important basic image filters:
$$g_\sigma(x, y) = \frac{1}{2\pi\sigma^2}\, e^{-\frac{x^2 + y^2}{2\sigma^2}}$$
[Figure: surface plot of $g_\sigma(x, y)$ over $x$ and $y$, with sampled kernels of size 3×3 ($\sigma = 0.5$), 5×5 ($\sigma = 1.0$), 7×7 ($\sigma = 1.5$), 9×9 ($\sigma = 2.0$), 11×11 ($\sigma = 2.5$)]
Many nice properties motivate the use of the Gaussian filter:
– It is the only filter that is both separable and circularly symmetric
– It has optimal joint localization in spatial and frequency domain
– The Fourier transform of a Gaussian is also a Gaussian function
– The n-fold convolution of any low-pass filter converges to a Gaussian
– It is infinitely smooth so it can be differentiated to any desired degree
– It scales naturally (sigma) and allows for consistent scale-space theory

Gaussian filtering examples
[Figure: input image and Gaussian-smoothed results for $\sigma$ = 0.5, 1.0, 1.5, 2.0, 2.5]

Median filter
– Is an order-statistics filter (based on ordering and ranking pixel values)
– Calculates the median pixel value in a neighbourhood $N$ with $|N|$ pixels
– The median value $m$ of a set of ordered values is the middle value
– At most half the values in the set are $< m$ and at most half are $> m$
Set: 115, 10, 25, 12, 221, 46, 91, 178, 193
Ordered: 10, 12, 25, 46, 91, 115, 178, 193, 221
Median: 91
In the case of an even number of values, the median is often taken to be the arithmetic mean of the two middle values.
[Figure: a 3×3 input neighbourhood whose output pixel is marked "?". Its values in sorted order are 19, 37, 43, 44, 48, 51, 58, 68, 69, so the median is 48.]
Taking the minimum or maximum instead of the median is called min-filtering and max-filtering respectively.
– Forces pixels with distinct intensities to be more like their neighbours
– It eliminates isolated intensity spikes (salt and pepper image noise)
– Neighbourhood is typically of size $n \times n$ pixels with $n = 3, 5, 7, \ldots$
– This also eliminates pixel clusters (light or dark) with area $< n^2/2$
– Is not a convolution filter but an example of a nonlinear filter

Median filtering example
[Figure: input, 3×3 mean filtered, 3×3 median filtered]
Noise pixels are completely removed rather than averaged out.

Gaussian versus median filtering
[Figure: original, Gaussian-filtered, and median-filtered images for two examples]
             Gaussian   Median
Example 1    ✓          ✗        Gaussian filtering is best if small objects must be retained
Example 2    ✗          ✓        Median filtering is best if small objects must be removed

Sharpening by unsharp masking
[Figure: block diagram. The Gaussian-filtered image is subtracted from the input to give the high frequencies, which are scaled by a factor $a$ and added to the input to produce the output.]

Pooling
– Combines filtering and downsampling in one operation
– Examples include max / min / median / average pooling
– Makes the image smaller and reduces computations
– Popular in deep convolutional neural networks
[Figure: max pooling with a 2×2 filter and stride 2]

Derivative filters
Gradient-domain filtering
Spatial derivatives respond to intensity changes (such as object edges).
In digital images they are approximated using finite differences.
There are different possible ways to take finite differences:
Forward difference:   $\frac{\partial f}{\partial x} \approx f(x + 1) - f(x)$       Kernel: [1 –1]
Backward difference:  $\frac{\partial f}{\partial x} \approx f(x) - f(x - 1)$       Kernel: [1 –1]
Central difference:   $\frac{\partial f}{\partial x} \approx f(x + 1) - f(x - 1)$   Kernel: [1 0 –1]
The forward and backward kernels have the same weights and differ only in which entry is aligned with the current pixel.
Note: Kernels are flipped in the convolution process
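The finite differences above can be checked on a small 1D signal. The following is a minimal sketch, assuming NumPy is available; the sample signal $f(x) = x^2$ and all variable names are illustrative choices, not part of the lecture material. It also illustrates the note about flipping: convolving with [1 0 –1] gives $f(x+1) - f(x-1)$, whereas correlating with the same kernel (no flip) gives the opposite sign.

```python
import numpy as np

# Illustrative 1D signal: f(x) = x^2 sampled at x = 0..5 (not from the slides).
f = np.array([0.0, 1.0, 4.0, 9.0, 16.0, 25.0])

# Finite differences evaluated at the interior points x = 1..4.
forward = f[2:] - f[1:-1]     # f(x+1) - f(x)
backward = f[1:-1] - f[:-2]   # f(x)   - f(x-1)
central = f[2:] - f[:-2]      # f(x+1) - f(x-1), no 1/2 factor, as on the slide

# Convolution flips the kernel, so [1, 0, -1] yields f(x+1) - f(x-1).
kernel = np.array([1.0, 0.0, -1.0])
central_conv = np.convolve(f, kernel, mode="valid")

# Correlation does not flip the kernel, so the sign is reversed.
central_corr = np.correlate(f, kernel, mode="valid")

print(forward)       # [3. 5. 7. 9.]
print(backward)      # [1. 3. 5. 7.]
print(central_conv)  # [ 4.  8. 12. 16.]
print(central_corr)  # [ -4.  -8. -12. -16.]
```

For this signal the true derivative $2x$ at $x = 1, \ldots, 4$ is 2, 4, 6, 8. The forward difference overshoots by 1, the backward difference undershoots by 1, and half the central difference matches exactly.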

Derivative filters
Second-order spatial derivative using finite differences.
Backward difference of forward differences:
$$\frac{\partial^2 f}{\partial x^2} \approx \frac{\partial f}{\partial x}(x) - \frac{\partial f}{\partial x}(x - 1) = \big[f(x + 1) - f(x)\big] - \big[f(x) - f(x - 1)\big] = f(x + 1) - 2f(x) + f(x - 1)$$
Kernel: [1 –2 1]
Central difference (half-pixel step) of central differences (half-pixel step):
$$\frac{\partial^2 f}{\partial x^2} \approx \frac{\partial f}{\partial x}\!\left(x + \tfrac{1}{2}\right) - \frac{\partial f}{\partial x}\!\left(x - \tfrac{1}{2}\right) = \big[f(x + 1) - f(x)\big] - \big[f(x) - f(x - 1)\big] = f(x + 1) - 2f(x) + f(x - 1)$$
Kernel: [1 –2 1]

Derivative filters
Sampled approximations of the continuous Gaussian derivatives:
$g(x)$:    [1 2 1]
$g'(x)$:   [1 0 –1]
$g''(x)$:  [1 –2 1]
Similarly in $y$, with the same weights arranged as column kernels.

Gaussian derivative filters
$$\frac{\partial g}{\partial x} \equiv g_x$$
Extension of Gaussian filter kernels to 2D and different spatial scales.
[Figure: 2D Gaussian kernel $g$ and its derivative kernels ($g_x$, $g_y$, ...) at $\sigma = 1.0$ (7×7) and $\sigma = 3.0$ (19×19)]

Prewitt and Sobel kernels
Differentiation in one dimension and smoothing in the other dimension.
Prewitt:
$$p_x = \begin{bmatrix} 1 & 0 & -1 \\ 1 & 0 & -1 \\ 1 & 0 & -1 \end{bmatrix} \qquad p_y = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 0 & 0 \\ -1 & -1 & -1 \end{bmatrix}$$
Sobel:
$$s_x = \begin{bmatrix} 1 & 0 & -1 \\ 2 & 0 & -2 \\ 1 & 0 & -1 \end{bmatrix} \qquad s_y = \begin{bmatrix} 1 & 2 & 1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{bmatrix}$$
In $p_x$ and $s_x$ the rows differentiate (in $x$) and the columns smooth (in $y$). In $p_y$ and $s_y$ the rows smooth (in $x$) and the columns differentiate (in $y$).

Separable filter kernels
Uniform (smoothing in $x$, smoothing in $y$):
$$u = \frac{1}{9}\begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix} = \frac{1}{3}\begin{bmatrix} 1 & 1 & 1 \end{bmatrix} * \frac{1}{3}\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$$
Prewitt (first derivative in $x$, smoothing in $y$):
$$p_x = \begin{bmatrix} 1 & 0 & -1 \\ 1 & 0 & -1 \\ 1 & 0 & -1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & -1 \end{bmatrix} * \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$$
Sobel (smoothing in $x$, first derivative in $y$):
$$s_y = \begin{bmatrix} 1 & 2 & 1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{bmatrix} = \begin{bmatrix} 1 & 2 & 1 \end{bmatrix} * \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix}$$
Gauss (smoothing in $x$, smoothing in $y$):
$$g = \frac{1}{16}\begin{bmatrix} 1 & 2 & 1 \\ 2 & 4 & 2 \\ 1 & 2 & 1 \end{bmatrix} = \frac{1}{4}\begin{bmatrix} 1 & 2 & 1 \end{bmatrix} * \frac{1}{4}\begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}$$

Separable filter kernels
Allow for a much more computationally efficient implementation.
$$g(x, y) = \frac{1}{16}\begin{bmatrix} 1 & 2 & 1 \\ 2 & 4 & 2 \\ 1 & 2 & 1 \end{bmatrix} \;\rightarrow\; o(x, y) = (f * g)(x, y) = \sum_{i=-1}^{1} \sum_{j=-1}^{1} f(x - i, y - j)\, g(i, j)$$
9 multiplies + 8 adds = 17 ops/pixel
Can be rewritten as $g(x, y) = g(x)\, g(y)$:
$$o(x, y) = \sum_{i=-1}^{1} \sum_{j=-1}^{1} f(x - i, y - j)\, g(i)\, g(j), \qquad g(x) = \frac{1}{4}\begin{bmatrix} 1 & 2 & 1 \end{bmatrix}, \quad g(y) = \frac{1}{4}\begin{bmatrix} 1 & 2 & 1 \end{bmatrix}^T$$
2 × (3 multiplies + 2 adds) = 10 ops/pixel
Even higher gains for larger kernels and 3D images.

Example of Prewitt filtering
[Figure: $f$ (intensity scale 0 to 255), $f * p_x$ and $f * p_y$ (intensity scale −255 to 255)]

Laplacian filtering
Approximating the sum of second-order derivatives:
$$f \rightarrow f_{xx} + f_{yy} = \nabla^2 f, \qquad \begin{bmatrix} 1 & -2 & 1 \end{bmatrix} + \begin{bmatrix} 1 \\ -2 \\ 1 \end{bmatrix} \;\rightarrow\; \begin{bmatrix} 0 & 1 & 0 \\ 1 & -4 & 1 \\ 0 & 1 & 0 \end{bmatrix}$$
[Figure: input image (intensity scale 0 to 255) and its Laplacian (intensity scale −255 to 255)]

Intensity gradient vector
Gradient vector (2D):
$$\nabla f(x, y) = \big[f_x(x, y),\; f_y(x, y)\big]^T$$
– Points in the direction of steepest intensity increase
– Is orthogonal to isophotes (lines of equal intensity)
Gradient magnitude (2D):
$$\|\nabla f(x, y)\| = \sqrt{f_x^2(x, y) + f_y^2(x, y)}$$
– Represents the length of the gradient vector
– Is the magnitude of the local intensity change

Edge detection with the gradient magnitude
[Figure: $f(x, y)$, $f_x(x, y)$, $f_y(x, y)$, and $\|\nabla f(x, y)\|$; edges lie at the largest gradients]

Edge detection with the Laplacian
[Figure: $\nabla^2 f(x, y)$; edges lie at the zero crossings]

Selecting the right spatial scale
Computing image derivatives using Gaussian derivative kernels.
[Figure: results for $\sigma$ = 1, 3, 5, 7, 9]
✓ Edges from thresholding local maxima of $\|\nabla f(x, y)\|$
✓ Edges from finding the zero-crossings of $\nabla^2 f(x, y)$

Differentiation in the Fourier domain
Spatial domain                                  Fourier domain
$f(x)$                                          $\hat{f}(\omega)$
$\frac{\partial^n f}{\partial x^n}(x)$          $(i\omega)^n \hat{f}(\omega)$
Differentiation suppresses low frequencies but blows up high frequencies (including noise).
[Figure: plot of $(i\omega)^n$ against $\omega$ for $n$ = 1, 2, 3]

Sharpening using the Laplacian
[Figure: $f(x, y)$, $\nabla^2 f(x, y)$, and $f(x, y) - \nabla^2 f(x, y)$]

Gradient Domain Editing
Perez et al., “Poisson Image Editing”, 2003

Further reading on discussed topics
– Chapter 3 of Gonzalez and Woods 2002
– Sections 3.1-3.3 of Szeliski
Acknowledgement: Some images drawn from the above resources

Example exam question
What is the effect of the following 2D convolution kernel when applied to an image?
$$\begin{bmatrix} 0 & 1 & 0 \\ 1 & -4 & 1 \\ 0 & 1 & 0 \end{bmatrix}$$
A. It approximates the sum of first-order derivatives in $x$ and $y$.
B. It approximates the sum of second-order derivatives in $x$ and $y$.
C. It approximates the product of first-order derivatives in $x$ and $y$.
D. It approximates the product of second-order derivatives in $x$ and $y$.
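
The Laplacian filtering slide states that this kernel approximates the sum of second-order derivatives. That statement can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the helper `convolve2d_valid` and the test image $f(x, y) = x^2 + 3y^2$ are hypothetical illustrations, not part of the lecture material. For this image $f_{xx} = 2$ and $f_{yy} = 6$, so the sum of second-order derivatives is 8 everywhere, while their product would be 12.

```python
import numpy as np

def convolve2d_valid(image, kernel):
    """Direct 2D convolution, computed only where the kernel fits in the image.

    Hypothetical helper for illustration; no border expansion is applied.
    """
    kh, kw = kernel.shape
    flipped = kernel[::-1, ::-1]  # convolution flips the kernel in both axes
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * flipped)
    return out

laplace = np.array([[0.0, 1.0, 0.0],
                    [1.0, -4.0, 1.0],
                    [0.0, 1.0, 0.0]])

# The kernel is the sum of the 1D second-derivative kernel [1, -2, 1]
# placed along x (middle row) and along y (middle column).
d2 = np.array([1.0, -2.0, 1.0])
kx = np.zeros((3, 3))
ky = np.zeros((3, 3))
kx[1, :] = d2
ky[:, 1] = d2
print(np.array_equal(kx + ky, laplace))  # True

# Illustrative test image f(x, y) = x^2 + 3 y^2 on a 6x6 grid.
y, x = np.mgrid[0:6, 0:6].astype(float)
f = x**2 + 3.0 * y**2

result = convolve2d_valid(f, laplace)
print(result.shape)           # (4, 4)
print(np.all(result == 8.0))  # True: f_xx + f_yy = 2 + 6
```

The finite-difference kernel is exact here because the test image is quadratic; for general images it is only an approximation.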