COMP9517 Feature Representation Part 1 PDF
UNSW
Erik Meijering
Summary
These lecture notes on computer vision focus on feature representation. They discuss the need for representing images as features and cover the main categories of image features: colour, texture, and shape. They outline prominent descriptors, namely Haralick features, local binary patterns (LBP), and the scale-invariant feature transform (SIFT), and demonstrate their use in applications such as image matching and stitching.
Full Transcript
COMP9517 Computer Vision
2024 Term 2 Week 3
Professor Erik Meijering
Feature Representation Part 1

Topics and learning goals
– Explain the need for feature representation: robustness, descriptiveness, efficiency
– Discuss major categories of image features: colour features, texture features, shape features
– Understand prominent feature descriptors: Haralick features, local binary patterns, scale-invariant feature transform
– Show examples of use in computer vision applications: image matching and stitching

Copyright (C) UNSW COMP9517 24T2W3 Feature Representation Part 1

What are image features?
Image features are vectors that are a compact representation of images
They represent important information shown in an image
Examples of image features:
– Blobs
– Edges
– Corners
– Ridges
– Circles
– Ellipses
– Lines
Example: $\mathbf{v} = [(x_1, y_1), (x_2, y_2), (x_3, y_3), \ldots, (x_n, y_n)]^T$

Why do we need image features?
We need to represent images as feature vectors for further processing in a more efficient and robust way
Examples of further processing include:
– Object detection
– Image segmentation
– Image classification
– Image retrieval
– Image stitching
– Object tracking

Example: object detection

Example: image segmentation

Example: image classification
Figure: training set, unseen test image, class? Classes shown: Airplane, Car, Bird, Cat, Deer, Dog, Frog,
Horse, Boat, Truck

Example: image retrieval
Figure: given an image, find the most similar image in a database

Example: image stitching

Example: object tracking

Need for image features
Why not just use pixel values directly as features?
– Pixel values change with light intensity, colour, angle
– They also change with camera orientation
– And they are highly redundant
Example: 1,000 x 1,000 pixels = 1,000,000 values, x 3 channels = 3,000,000 values, versus a few features (windows, exhaust, wheel): Truck!

Desirable properties of features
Reproducibility (robustness)
– Should be detectable at the same locations in different images despite changes in illumination and viewpoint
Saliency (descriptiveness)
– Similar salient points in different images should have similar features
Compactness (efficiency)
– Fewer features
– Smaller features

General framework
– Week 1: Image Formation
– Week 2: Image Preprocessing
– Week 3: Feature Representation
– Week 4: Pattern Recognition
– Weeks 7 & 8: Deep Learning
– Postprocessing
Applications: object detection, image segmentation, image classification, image retrieval, image stitching, object tracking, …

Types of image features
Colour features (Part 1)
– Colour histogram
– Colour moments
Texture features (Part 1)
– Haralick texture features
– Local binary patterns (LBP)
– Scale-invariant feature transform (SIFT)
Shape features (Part 2)
– Basic shape features
– Shape context
– Histogram of oriented gradients (HOG)

Colour features
Colour is the simplest feature to compute
Invariant to image scaling,
translation and rotation
Example: colour-based image retrieval http://labs.tineye.com/multicolr/

Colour histogram
Represent the global distribution of pixel colours in an image
– Step 1: Construct a histogram for each colour channel (R, G, B)
– Step 2: Concatenate the histograms (vectors) of all channels as the final feature vector
$\mathbf{v} = [r_0, \ldots, r_{255}, g_0, \ldots, g_{255}, b_0, \ldots, b_{255}]^T$, where $r_k$, $g_k$, $b_k$ are the bins of the histograms of the R, G, and B channels

Colour moments
Another way of representing colour distributions
$f_{ij}$ = value of the $i$th colour component of pixel $j$
$N$ = number of pixels in the image
First-order moment (mean): $\mu_i = \frac{1}{N} \sum_{j=0}^{N-1} f_{ij}$
Second-order moment (standard deviation): $\sigma_i^2 = \frac{1}{N} \sum_{j=0}^{N-1} (f_{ij} - \mu_i)^2$
Third-order moment (skewness): $s_i^3 = \frac{1}{N} \sum_{j=0}^{N-1} (f_{ij} - \mu_i)^3$
Moment-based representation of colour distributions
– Gives a feature vector of only 9 elements (for RGB images)
– Lower representation capability than the colour histogram

Example application
Colour-based image retrieval https://doi.org/10.1016/j.csi.2010.03.004
Figure panels: using only colour histogram information; using colour + texture + shape information

Texture features
Visual characteristics and appearance of objects
Powerful discriminating feature for identifying visual patterns
Properties of structural homogeneity beyond colour or intensity
Especially used for texture classification
https://arxiv.org/abs/1801.10324

Haralick texture features
Array of statistical descriptors of image patterns
Capture spatial relationship between neighbouring pixels
Step 1: Construct the gray-level co-occurrence matrix (GLCM)
Step 2: Compute the Haralick feature
descriptors from the GLCM
https://doi.org/10.1109/TSMC.1973.4309314

Haralick texture features
Step 1: Construct the GLCMs
– Given distance $d$ and orientation angle $a$
– Compute co-occurrence count $p_{(d,a)}(i_1, i_2)$ of going from gray level $i_1$ to $i_2$ at $d$ and $a$
– Construct matrix $\mathbf{P}_{(d,a)}$ with elements $(i_1, i_2)$ being $p_{(d,a)}(i_1, i_2)$
– If an image has $L$ distinct gray levels, the matrix size is $L \times L$
Example image ($L = 4$):
0 0 0 0 1 1 1 2
0 0 0 1 1 2 2 3
0 0 1 1 2 2 3 3
0 2 2 3 3 2 2 1
2 2 3 3 3 2 1 1
2 3 3 3 2 2 1 0
3 3 2 2 1 1 0 0
3 2 2 1 1 0 0 0
Corresponding example co-occurrence matrices:
$\mathbf{P}_{(1,0^\circ)} = \begin{bmatrix} 18 & 6 & 1 & 0 \\ 6 & 14 & 8 & 0 \\ 1 & 8 & 16 & 10 \\ 0 & 0 & 10 & 14 \end{bmatrix}$
$\mathbf{P}_{(1,90^\circ)} = \begin{bmatrix} 18 & 5 & 2 & 0 \\ 5 & 10 & 8 & 2 \\ 2 & 8 & 14 & 11 \\ 0 & 2 & 11 & 14 \end{bmatrix}$

Haralick texture features
Step 1: Construct the GLCMs
– For computational efficiency $L$ can be reduced by binning (similar to histogram binning)
  Example: $L = 256/n$ for a constant factor $n$
– Different co-occurrence matrices can be constructed by using various combinations of distance $d$ and angular orientation $a$
– On their own these co-occurrence matrices do not provide any measure of texture that can be easily used as texture descriptors
– The information in the co-occurrence matrices needs to be further extracted as a set of feature values such as the Haralick descriptors

Haralick texture features
Step 2: Compute the Haralick descriptors from the GLCMs
– One set of Haralick descriptors for each GLCM for a given $d$ and $a$

Many other texture features exist
https://doi.org/10.1016/j.patrec.2008.04.013

Example application of Haralick features
Often used in medical imaging studies due to their simplicity and interpretability
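To make Step 1 of the Haralick procedure concrete, here is a minimal pure-Python sketch that builds the co-occurrence matrices for the 8 x 8 example image from the slides. It assumes each pixel pair is counted in both directions, since that is what reproduces the symmetric example matrices shown above. The function names `glcm` and `contrast` are illustrative, and contrast is used only as one representative Haralick descriptor; the slides do not list which descriptors are computed.

```python
def glcm(image, d_row, d_col, levels):
    """Symmetric co-occurrence counts for the pixel offset (d_row, d_col)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i1, i2 = image[r][c], image[r2][c2]
                counts[i1][i2] += 1  # pair counted from i1 to i2 ...
                counts[i2][i1] += 1  # ... and in the reverse direction (assumption)
    return counts


def contrast(counts):
    """Contrast descriptor: (i1 - i2)^2 weighted by the normalised counts."""
    total = sum(sum(row) for row in counts)
    weighted = sum((i1 - i2) ** 2 * value
                   for i1, row in enumerate(counts)
                   for i2, value in enumerate(row))
    return weighted / total


# Example image from the slides (L = 4 gray levels).
IMAGE = [
    [0, 0, 0, 0, 1, 1, 1, 2],
    [0, 0, 0, 1, 1, 2, 2, 3],
    [0, 0, 1, 1, 2, 2, 3, 3],
    [0, 2, 2, 3, 3, 2, 2, 1],
    [2, 2, 3, 3, 3, 2, 1, 1],
    [2, 3, 3, 3, 2, 2, 1, 0],
    [3, 3, 2, 2, 1, 1, 0, 0],
    [3, 2, 2, 1, 1, 0, 0, 0],
]

P_0 = glcm(IMAGE, 0, 1, levels=4)    # d = 1, a = 0 degrees (right neighbour)
P_90 = glcm(IMAGE, -1, 0, levels=4)  # d = 1, a = 90 degrees (upper neighbour)

for row in P_0:
    print(row)
print("contrast at (1, 0 degrees):", contrast(P_0))
```

In practice one such matrix, and one set of descriptors, would be computed for each chosen combination of distance and angle, and the descriptor values concatenated into the feature vector.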
C. Jensen et al. (2019) Assessment of prostate cancer prognostic Gleason grade group using zonal-specific features extracted from biparametric MRI using a KNN classifier. Journal of Applied Clinical Medical Physics 20(2):146-153. https://doi.org/10.1002/acm2.12542
1. Preprocess the biparametric MRI images
2. Extract Haralick, run-length, and histogram features
3. Apply feature selection
4. Classify using KNN

Local binary patterns
Describe the spatial structure of local image texture
– Divide the image into cells of $N \times N$ pixels (for example $N$ = 16 or 32)
– Compare each pixel in a given cell to each of its 8 neighbouring pixels
– If the centre pixel value is greater than the neighbour value, write 0, otherwise write 1
– This gives an 8-digit binary pattern per pixel, representing a value in the range 0…255
Example image:
0 0 0 0 1 1 1 2
0 0 0 1 1 2 2 3
0 0 1 1 2 2 3 3
0 2 2 3 3 2 2 1
2 2 3 3 3 2 1 1
2 3 3 3 2 2 1 0
3 3 2 2 1 1 0 0
3 2 2 1 1 0 0 0
Example pattern for one pixel: 11110000 = 240

Local binary patterns
Describe the spatial structure of local image texture
– Count the number of times each 8-digit binary number occurs in the cell
– This gives a 256-bin histogram (also known as the LBP feature vector)
– Combine the histograms of all cells of the given image
– This gives the image-level LBP feature descriptor
Example (same image as above): the patterns 11111111 = 255, 11111111 = 255, 11110001 = 241, 11111011 = 251, … are counted into the histogram

Local binary patterns
LBP can be multiresolution and rotation-invariant
– Multiresolution: vary the distance between the centre pixel and neighbouring pixels and vary the number of neighbouring pixels
T. Ojala, M. Pietikainen, T.
Maenpaa (2002) Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Transactions on Pattern Analysis and Machine Intelligence 24(7):971-987. https://doi.org/10.1109/TPAMI.2002.1017623

Local binary patterns
LBP can be multiresolution and rotation-invariant
– Rotation-invariant: vary the way of constructing the 8-digit binary number by performing bitwise shift to derive the smallest number
Example: the bitwise shifts of 11110000 are
11110000 = 240, 11100001 = 225, 11000011 = 195, 10000111 = 135, 00001111 = 15, 00011110 = 30, 00111100 = 60, 01111000 = 120
and the smallest number is 15
Note: not all patterns have 8 shifted variants (e.g. 11001100 has only 4)
This reduces the LBP feature dimension from 256 to 36

Example application of LBP
Texture classification https://doi.org/10.1109/TPAMI.2002.1017623

Scale-invariant feature transform
SIFT feature describes texture in a localised region around a keypoint
SIFT descriptor is invariant to various transformations
Recognising that this is the same object requires invariance to scaling, rotation, affine distortion, illumination changes…
https://www.analyticsvidhya.com/blog/2019/10/detailed-guide-powerful-sift-technique-image-matching-python/

SIFT algorithm overview
– Scale-space extrema detection: find maxima/minima in DoG images across scales
– Keypoint localization: discard low-contrast keypoints and eliminate edge responses
– Orientation assignment: achieve rotation invariance by orientation assignment
– Keypoint descriptor: compute gradient orientation histograms

SIFT extrema detection
Detect maxima and minima in the scale space of the image
$L(x, y, \sigma) = I(x, y) * G(x, y, \sigma)$ (Gaussian scale $\sigma$)
$D(x, y, \sigma) = L(x, y, k\sigma) - L(x, y, \sigma)$ (fixed factor $k$ between adjacent scales)
D. G. Lowe (2004). Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision 60(2):91-110. https://doi.org/10.1023/B:VISI.0000029664.99615.94

SIFT keypoint localization
Improve and reduce the set of found keypoints
– Use 3D quadratic fitting in scale-space to get subpixel optima
– Reject low-contrast and edge points using Hessian analysis
Figure panels: initial keypoints from scale-space optima; keypoints after rejecting low-contrast points; final keypoints after rejecting edge points

SIFT orientation assignment
Estimate keypoint orientation using local gradient vectors
– Make an orientation histogram of local gradient vectors
– Find the dominant orientation from the main peak of the histogram
– Create an additional keypoint for the second highest peak if it is >80% of the main peak

SIFT keypoint descriptor
Represent each keypoint by a 128D feature vector
– 4 x 4 array of gradient histograms weighted by magnitude
– 8 bins in each gradient orientation histogram
– Total 8 x 4 x 4 = 128 dimensions

Example application of SIFT
Matching two partially overlapping images
– Compute SIFT keypoints for each image

Example application of SIFT
Matching two
partially overlapping images
– Find best match between SIFT keypoints in 128D feature space

Descriptor matching
Using the nearest neighbour distance ratio (NNDR)
$\mathrm{NNDR} = \frac{d_1}{d_2} = \frac{\| D_A - D_B \|}{\| D_A - D_C \|}$
– Distance $d_1$ is to the first nearest neighbour
– Distance $d_2$ is to the second nearest neighbour
– Nearest neighbours in 128D feature space
– Reject matches with NNDR > 0.8

Example application of SIFT
Stitching two partially overlapping images
– Find SIFT keypoints and feature correspondences
– Find the right spatial transformation

Types of spatial transformation
Rigid transformations: translation, rotation
Nonrigid transformations: scaling, affine, perspective

Spatial coordinate transformation
Scale: $\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} s_x & 0 \\ 0 & s_y \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$
Shear: $\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} 1 & r_x \\ r_y & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$
Rotate: $\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$
Translate: $\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$
Affine: $\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} a & b & c \\ d & e & f \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$
Perspective: $\begin{bmatrix} x' \\ y' \\ w' \end{bmatrix} = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} \begin{bmatrix} x \\ y \\ w \end{bmatrix}$

Fitting and alignment
Least-squares (LS) fitting of corresponding keypoints $(\mathbf{x}_i, \mathbf{x}_i')$
– Find the parameters $\mathbf{p}$ of the transformation $T$ that minimize the squared error $E = \sum_i \| T(\mathbf{x}_i; \mathbf{p}) - \mathbf{x}_i' \|^2$
– Example for affine transformation of coordinates $\mathbf{x}_i = (x_i, y_i)$ into $\mathbf{x}_i' = (x_i', y_i')$:
$\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} a & b & c \\ d & e & f \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} \Rightarrow \begin{bmatrix} \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ x_i & y_i & 0 & 0 & 1 & 0 \\ 0 & 0 & x_i & y_i & 0 & 1 \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \end{bmatrix} \begin{bmatrix} a \\ b \\ d \\ e \\ c \\ f \end{bmatrix} = \begin{bmatrix} \vdots \\ x_i' \\ y_i' \\ \vdots \end{bmatrix}$
$\mathbf{A}\mathbf{p} = \mathbf{b} \Rightarrow \mathbf{p} = (\mathbf{A}^T \mathbf{A})^{-1} \mathbf{A}^T \mathbf{b}$

Fitting and alignment
RANdom SAmple Consensus (RANSAC) fitting
– Least-squares fitting is hampered by outliers
– Some kind of outlier detection and rejection is needed
– Better use a subset of the data and check inlier agreement
– RANSAC does this in an iterative way to find the optimum

Fitting and alignment
RANSAC example (line fitting model)
1. Sample (randomly) the number of points required to fit the model
2. Solve for the model parameters using the samples
3. Score by the fraction of inliers within a preset threshold $\delta$ of the model
Repeat 1-3 until the best model is found with high confidence

Fitting and alignment summary
Estimate the transformation given matched points $A$ and $B$
– Example for translation: $\begin{bmatrix} x_i^B \\ y_i^B \end{bmatrix} = \begin{bmatrix} x_i^A \\ y_i^A \end{bmatrix} + \begin{bmatrix} t_x \\ t_y \end{bmatrix}$
Figure: matched point pairs (A1, B1), (A2, B2), (A3, B3)

Alignment by least squares
Estimate the transformation given matched points $A$ and $B$
– Write down the system of equations: $\mathbf{A}\mathbf{p} = \mathbf{b}$
– Solve for the parameters: $\mathbf{p} = (\mathbf{A}^T \mathbf{A})^{-1} \mathbf{A}^T \mathbf{b}$
$\begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 1 & 0 \\ 0 & 1 \\ 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} t_x \\ t_y \end{bmatrix} = \begin{bmatrix} x_1^B - x_1^A \\ y_1^B - y_1^A \\ x_2^B - x_2^A \\ y_2^B - y_2^A \\ x_3^B - x_3^A \\ y_3^B - y_3^A \end{bmatrix}$

Alignment by random sample consensus
Estimate the transformation given matched points $A$ and $B$
Repeat $N$ times:
1. Sample a set of matching points (one pair)
2. Solve for transformation parameters
3. Score parameters with number of inliers

Summary
Feature representation is essential in solving many computer vision problems
Most commonly used image features:
– Colour features (Part 1): colour moments and histogram
– Texture features (Part 1): Haralick, LBP, SIFT
– Shape features (Part 2): basic, shape context, HOG

Summary
Other techniques discussed (Part 1)
– Descriptor matching
– Least squares and RANSAC
– Spatial transformations
Techniques to be discussed (Part 2)
– Feature encoding (Bag-of-Words)
– K-means clustering
– Shape matching
– Sliding window detection

Further reading on discussed topics
Chapters 4 and 6 of Szeliski

Acknowledgements
Some content from slides of James Hays, Michael A. Wirth, Cordelia Schmid
From BoW to CNN: Two decades of texture representation for texture classification
And other resources as indicated by the hyperlinks

Example exam question
Which one of the following statements about feature descriptors is incorrect?
A. Haralick features are derived from gray-level co-occurrence matrices.
B. SIFT achieves rotation invariance by computing gradient histograms at multiple scales.
C. LBP describes local image texture and can be multiresolution and rotation-invariant.
D. Colour moments have lower representation capability than the colour histogram.
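As a rough illustration of the colour histogram and colour moments slides, the sketch below computes both feature vectors for an image given as a flat list of (R, G, B) tuples. That input format, the function names, and the tiny test image are assumptions made for the example. The standard deviation and skewness are taken as the square root and the signed cube root of the second and third central moments, which is one common reading of the moment definitions.

```python
import math


def colour_histogram(pixels, bins=256):
    """Concatenate the R, G and B histograms into one 3 * bins vector."""
    hist = [0] * (3 * bins)
    for pixel in pixels:
        for channel, value in enumerate(pixel):
            hist[channel * bins + value] += 1
    return hist


def colour_moments(pixels):
    """Return [mean, std, skewness] for each channel: 9 values for RGB."""
    n = len(pixels)
    features = []
    for channel in range(3):
        values = [pixel[channel] for pixel in pixels]
        mean = sum(values) / n
        m2 = sum((v - mean) ** 2 for v in values) / n
        m3 = sum((v - mean) ** 3 for v in values) / n
        std = math.sqrt(m2)
        # Signed cube root, so a negative third moment gives negative skewness.
        skew = math.copysign(abs(m3) ** (1.0 / 3.0), m3)
        features.extend([mean, std, skew])
    return features


# Hypothetical 4-pixel "image", only to exercise the functions.
PIXELS = [(0, 7, 0), (2, 7, 0), (0, 7, 0), (2, 7, 4)]

HIST = colour_histogram(PIXELS)
MOMENTS = colour_moments(PIXELS)
print(len(HIST), len(MOMENTS))  # 768 and 9
```

The length difference (768 versus 9 values) is the compactness trade-off mentioned in the slides: the moments are far smaller but describe the colour distribution less completely.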
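The local binary pattern steps can be sketched as follows. The slides do not state in which order the 8 neighbours are read, so the clockwise order starting at the top-left neighbour used here is an assumption, as are the function names and the choice to skip border pixels. The comparison rule (write 0 if the centre is greater than the neighbour, otherwise 1) and the minimum-over-bitwise-shifts rule follow the slides.

```python
# Neighbour offsets (row, col), clockwise from the top-left neighbour.
# This reading order is an assumption; the slides do not specify it.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, r, c):
    """8-bit LBP value of the interior pixel at (r, c)."""
    centre = image[r][c]
    code = 0
    for d_row, d_col in OFFSETS:
        bit = 0 if centre > image[r + d_row][c + d_col] else 1
        code = (code << 1) | bit
    return code


def rotation_invariant(code):
    """Smallest value among the 8 circular bit shifts of an 8-bit code."""
    return min(((code >> k) | (code << (8 - k))) & 0xFF for k in range(8))


def lbp_histogram(image):
    """256-bin histogram of LBP codes over the interior pixels of one cell."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist


# Example image from the slides, treated here as a single cell.
IMAGE = [
    [0, 0, 0, 0, 1, 1, 1, 2],
    [0, 0, 0, 1, 1, 2, 2, 3],
    [0, 0, 1, 1, 2, 2, 3, 3],
    [0, 2, 2, 3, 3, 2, 2, 1],
    [2, 2, 3, 3, 3, 2, 1, 1],
    [2, 3, 3, 3, 2, 2, 1, 0],
    [3, 3, 2, 2, 1, 1, 0, 0],
    [3, 2, 2, 1, 1, 0, 0, 0],
]

HIST = lbp_histogram(IMAGE)
DISTINCT = {rotation_invariant(v) for v in range(256)}
print(rotation_invariant(0b11110000))  # 15, as in the slide example
print(len(DISTINCT))                   # 36 rotation-invariant patterns
```

Enumerating all 256 codes confirms the reduction from 256 to 36 dimensions stated in the slides. For a full image, one histogram per cell would be computed and the histograms concatenated.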
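The NNDR test from the descriptor matching slide can be sketched in a few lines. The example uses made-up 2D descriptors instead of 128D SIFT vectors to keep the numbers readable; the function name `match_nndr` and the brute-force search are illustrative choices, not something prescribed by the slides.

```python
import math


def match_nndr(descriptors_a, descriptors_b, max_ratio=0.8):
    """Match each descriptor in A to its nearest neighbour in B.

    A match is kept only if NNDR = d1 / d2 is at most max_ratio, where d1
    and d2 are the distances to the first and second nearest neighbours.
    Assumes descriptors_b holds at least two descriptors.
    """
    matches = []
    for i, d_a in enumerate(descriptors_a):
        ranked = sorted((math.dist(d_a, d_b), j)
                        for j, d_b in enumerate(descriptors_b))
        (d1, j1), (d2, _) = ranked[0], ranked[1]
        if d2 > 0 and d1 / d2 <= max_ratio:
            matches.append((i, j1))
    return matches


# Hypothetical toy descriptors (2D instead of 128D).
DESC_A = [(0, 0), (5, 5)]
DESC_B = [(0, 1), (10, 10), (5, 4), (5, 6)]

MATCHES = match_nndr(DESC_A, DESC_B)
print(MATCHES)  # [(0, 0)]
```

The first descriptor has one clearly closest neighbour and is kept. The second has two equally close neighbours (NNDR = 1), so the match is ambiguous and is rejected, which is the purpose of the ratio test.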
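The two alignment slides (least squares and random sample consensus) can be compared on the translation example with a small sketch. For a pure translation the least-squares solution reduces to the mean of the coordinate differences, because $\mathbf{A}^T \mathbf{A}$ is $n$ times the identity. The matched points, the inlier threshold, the iteration count, and the final least-squares refit on the inliers are assumptions of this sketch rather than details from the slides.

```python
import math
import random


def ls_translation(pairs):
    """Least-squares translation: the mean of (B - A) over all matched pairs."""
    n = len(pairs)
    t_x = sum(b[0] - a[0] for a, b in pairs) / n
    t_y = sum(b[1] - a[1] for a, b in pairs) / n
    return t_x, t_y


def ransac_translation(pairs, threshold=1.0, iterations=50, seed=0):
    """RANSAC for translation: sample one pair, solve, score by inlier count."""
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iterations):
        a, b = rng.choice(pairs)             # 1. sample one matching pair
        t_x, t_y = b[0] - a[0], b[1] - a[1]  # 2. solve for the translation
        inliers = [(p, q) for p, q in pairs  # 3. score by number of inliers
                   if math.hypot(p[0] + t_x - q[0], p[1] + t_y - q[1]) <= threshold]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Refit on the inliers of the best sample (an added refinement step).
    return ls_translation(best_inliers), best_inliers


# Hypothetical matches: five pairs related by t = (3, -2) plus two outliers.
PAIRS = [
    ((0, 0), (3, -2)), ((1, 2), (4, 0)), ((5, 1), (8, -1)),
    ((2, 7), (5, 5)), ((6, 6), (9, 4)),
    ((3, 3), (20, 20)), ((4, 1), (-10, 5)),
]

T_LS = ls_translation(PAIRS)
T_RANSAC, INLIERS = ransac_translation(PAIRS)
print("least squares on all pairs:", T_LS)
print("RANSAC estimate:", T_RANSAC, "with", len(INLIERS), "inliers")
```

The plain least-squares estimate is pulled away from (3, -2) by the two outliers, while the RANSAC estimate is fitted only to the pairs that agree with the best sampled translation.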