Face Detection (CS 4731, Fall 2024)

Document Details


Columbia University

2024

Shree K. Nayar

Tags

computer vision, face detection, machine learning, image processing

Summary

This document is lecture notes on Face Detection, part of the Computer Vision course CS 4731, Fall 2024 at Columbia University. It covers topics including uses of face detection, Haar features, integral images, nearest neighbor classifier, and Support Vector Machines (SVM).

Full Transcript


Face Detection
Computer Vision: CS 4731
Shree K. Nayar, Columbia University, Fall 2024

What is Face Detection? [I.5]
Locate human faces in images.

Topics:
(1) Uses of Face Detection
(2) Haar Features for Face Detection
(3) Integral Image
(4) Nearest Neighbor Classifier
(5) Support Vector Machine

Uses of Face Detection

Where is Face Detection Used?
Automatic selection of camera settings (autofocus, exposure, color balance, etc.) [I.6]
Finding people using search engines (for example, only faces of people named “Gates”)
Intelligent marketing [I.2]
Biometrics, surveillance, monitoring [I.3]

Haar Features for Face Detection

Face Detection in Computers [I.5]
Slide windows of different sizes across the image. At each location, match the window to a face model.

Face Detection Framework [I.7]
For each window: extract features $\mathbf{f}$, match them against the face model, and output Yes / No.
Features: Which features represent faces well?
Classifier: How to construct a face model and efficiently classify features as face or not?

What are Good Features? [I.7]
Interest points (edges, corners, SIFT)?
Facial components (templates)?

Characteristics of Good Features [I.8]
Discriminate face from non-face.
Extremely fast to compute: millions of windows need to be evaluated in an image.

Haar Features [I.5]
A set of correlation responses to Haar filters. Correlating the input image with the Haar filters $H_A, H_B, H_C, H_D, \ldots$ gives the responses $V_A[i,j], V_B[i,j], V_C[i,j], V_D[i,j], \ldots$, which together form the Haar feature vector $\mathbf{f}[i,j]$.

Discriminative Ability of Haar Feature [I.7]
Responses of one filter on four different image patches: $V_A = 64$, $V_A \approx 0$, $V_A = 16$, $V_A = -127$.
Haar features are sensitive to the directionality of patterns.

Detecting Faces of Different Size
Compute Haar features at different scales to detect faces of different sizes.

Computing a Haar Feature [I.5]
In the filter $H_A$, white = 1 and black = -1.
Response to filter $H_A$ at location $(i, j)$:
$V_A[i,j] = \sum_{m} \sum_{n} I[m,n] \, H_A[m-i, n-j]$
$V_A[i,j] = \sum (\text{pixel intensities in white area}) - \sum (\text{pixel intensities in black area})$

Haar Feature: Computation Cost [I.5]
For a filter of size $N \times M$:
$\text{Value} = \sum (\text{pixel intensities in white}) - \sum (\text{pixel intensities in black})$
Computation cost = $(N \times M - 1)$ additions per pixel, per filter, per scale.
Can we do better?

Integral Image

Integral Image [Crow 1985, Viola 2002]
A table that holds the sum of all pixel values to the left and top of a given pixel, inclusive.

Image I:
 98 110 121 125 122 129
 99 110 120 116 116 129
 97 109 124 111 123 134
 98 112 132 108 123 133
 97 113 147 108 125 142
 95 111 168 122 130 137
 96 104 172 130 126 130

Integral Image II:
  98  208  329  454  576  705
 197  417  658  899 1137 1395
 294  623  988 1340 1701 2093
 392  833 1330 1790 2274 2799
 489 1043 1687 2255 2864 3531
 584 1249 2061 2751 3490 4294
 680 1449 2433 3253 4118 5052
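The definition above can be checked directly against the slide's numbers. The following is a minimal sketch, assuming the image is stored as a list of rows; the function name `integral_image` and the variable names are illustrative choices, not part of the lecture. It takes a running sum along each row and then down each column, which is one straightforward way to realize "sum of everything to the left and top, inclusive".

```python
from itertools import accumulate

# Image I from the slide (7 rows x 6 columns).
IMAGE = [
    [98, 110, 121, 125, 122, 129],
    [99, 110, 120, 116, 116, 129],
    [97, 109, 124, 111, 123, 134],
    [98, 112, 132, 108, 123, 133],
    [97, 113, 147, 108, 125, 142],
    [95, 111, 168, 122, 130, 137],
    [96, 104, 172, 130, 126, 130],
]


def integral_image(image):
    """Return II where II[r][c] is the sum of image[0..r][0..c], inclusive."""
    # Running sum along each row ...
    row_sums = [list(accumulate(row)) for row in image]
    # ... then a running sum down each column of those row sums.
    columns = [list(accumulate(col)) for col in zip(*row_sums)]
    # Transpose back to row-major order.
    return [list(row) for row in zip(*columns)]


II = integral_image(IMAGE)
for row in II:
    print(" ".join(f"{value:5d}" for value in row))
```

The printed table should match the Integral Image II values listed above; the bottom-right entry equals the sum of all pixels in the image.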
Summation Within a Rectangle
Fast summations of arbitrary rectangles using integral images.
Let $P$ be the integral-image entry at the rectangle's bottom-right pixel, $Q$ the entry just above its top-right pixel, $S$ the entry just left of its bottom-left pixel, and $R$ the entry diagonally above-left of its top-left pixel. $II_R$ is added back because it is subtracted twice, once with $II_Q$ and once with $II_S$.
$\text{Sum} = II_P - II_Q - II_S + II_R = 3490 - 1137 - 1249 + 417 = 1521$
Computational cost: only 3 additions.

Haar Response using Integral Image
$V_A = \sum (\text{pixel intensities in white}) - \sum (\text{pixel intensities in black})$
$= (II_O - II_T + II_R - II_S) - (II_P - II_Q + II_T - II_O)$
$= (2061 - 329 + 98 - 584) - (3490 - 576 + 329 - 2061) = 64$
Computational cost: only 7 additions.

Computing Integral Image
Raster scanning. Pixel $A$ is the current pixel, $B$ is the pixel above it, $C$ the pixel to its left, and $D$ the pixel diagonally above-left:
D B
C A
Let $I_A$ and $II_A$ be the values of the image and the integral image, respectively, at pixel $A$.
$II_A = II_B + II_C - II_D + I_A$

Haar Features Using Integral Images [I.5]
The integral image needs to be computed only once per test image. It allows fast computation of Haar features.

Nearest Neighbor Classifier

Classifier for Face Detection [I.5]
Given the features for a window, how to decide whether it contains a face or not?

Feature Space [I.8]
The Haar feature vector $\mathbf{f}$ at a pixel is a point in an n-D space, $\mathbf{f} \in \mathbb{R}^n$.
(Figure: training data of faces and training data of non-faces plotted in feature space.)

Classifier [I.8]
Decision engine that classifies a test image as face or not.

Nearest Neighbor Classifier [I.8]
Find the nearest training sample using the $\mathcal{L}^2$ distance and assign its label.
(Figure examples: one test point is labeled Face, one is labeled Not Face, and one non-face test point is labeled Face, a false positive.)
The larger the training set, the more robust the NN classifier.
The larger the training set, the slower the NN classifier.

Decision Boundary [I.8]
A simple decision boundary separating the face and non-face classes will suffice.
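The rectangle sum (1521) and the Haar response (64) worked out on the integral-image slides can be reproduced in a few lines. This is a minimal sketch under stated assumptions: rows and columns are indexed from 0, the rectangle coordinates are inferred from the integral-image values quoted in the slides, and the helper names `integral_image`, `rect_sum`, and `haar_two_rect` are hypothetical. The integral image is built with the raster-scan recurrence $II_A = II_B + II_C - II_D + I_A$.

```python
# Image I from the slide (7 rows x 6 columns).
IMAGE = [
    [98, 110, 121, 125, 122, 129],
    [99, 110, 120, 116, 116, 129],
    [97, 109, 124, 111, 123, 134],
    [98, 112, 132, 108, 123, 133],
    [97, 113, 147, 108, 125, 142],
    [95, 111, 168, 122, 130, 137],
    [96, 104, 172, 130, 126, 130],
]


def integral_image(image):
    """Raster-scan construction: II_A = II_B + II_C - II_D + I_A."""
    rows, cols = len(image), len(image[0])
    ii = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            above = ii[r - 1][c] if r > 0 else 0                 # II_B
            left = ii[r][c - 1] if c > 0 else 0                  # II_C
            diag = ii[r - 1][c - 1] if r > 0 and c > 0 else 0    # II_D
            ii[r][c] = above + left - diag + image[r][c]
    return ii


def rect_sum(ii, top, left, bottom, right):
    """Sum of image[top..bottom][left..right] (inclusive) from four lookups."""
    def at(r, c):
        # Entries outside the table (row or column -1) count as zero.
        return ii[r][c] if r >= 0 and c >= 0 else 0

    return (at(bottom, right) - at(top - 1, right)
            - at(bottom, left - 1) + at(top - 1, left - 1))


def haar_two_rect(ii, top, left, bottom, mid, right):
    """White block (columns left..mid) minus black block (columns mid+1..right)."""
    white = rect_sum(ii, top, left, bottom, mid)
    black = rect_sum(ii, top, mid + 1, bottom, right)
    return white - black


II = integral_image(IMAGE)
print(rect_sum(II, 2, 2, 5, 4))          # 3490 - 1137 - 1249 + 417 = 1521
print(haar_two_rect(II, 1, 1, 5, 2, 4))  # (2061-329+98-584) - (3490-576+329-2061) = 64
```

Each rectangle costs four lookups regardless of its size, which is why the per-window cost no longer grows with the filter dimensions.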
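The nearest neighbor rule described in this section can be sketched as follows. The training vectors and labels below are invented purely for illustration and do not come from the lecture, and the function name `nearest_neighbor` is a hypothetical choice; the only part taken from the slides is the rule itself: find the closest training sample in L2 (Euclidean) distance and return its label.

```python
import math


def nearest_neighbor(train_features, train_labels, query):
    """Return the label of the training sample closest to `query` in L2 distance."""
    best_index = min(
        range(len(train_features)),
        key=lambda k: math.dist(train_features[k], query),
    )
    return train_labels[best_index]


# Hypothetical 2-D feature vectors, invented for illustration only.
TRAIN_FEATURES = [(60.0, 10.0), (70.0, 20.0), (-5.0, 90.0), (0.0, -120.0)]
TRAIN_LABELS = ["face", "face", "non-face", "non-face"]

print(nearest_neighbor(TRAIN_FEATURES, TRAIN_LABELS, (62.0, 12.0)))    # face
print(nearest_neighbor(TRAIN_FEATURES, TRAIN_LABELS, (2.0, -100.0)))   # non-face
```

Every query scans the whole training set, which is the cost behind the observation that a larger training set makes the classifier both more robust and slower.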
Support Vector Machine

Linear Decision Boundaries
A linear decision boundary in 2-D space is a 1-D line.
Equation of the line: $w_1 f_1 + w_2 f_2 + b = 0$, or in vector form $\mathbf{w}^T \mathbf{f} + b = 0$ with $\mathbf{w} = [w_1 \; w_2]^T$ and $\mathbf{f} = [f_1 \; f_2]^T$.
Faces: $\mathbf{w}^T \mathbf{f} + b > 0$. Non-faces: $\mathbf{w}^T \mathbf{f} + b < 0$.
In 3-D space the boundary is a plane: $w_1 f_1 + w_2 f_2 + w_3 f_3 + b = 0$.
In n-D space the boundary is a hyperplane: $w_1 f_1 + w_2 f_2 + \cdots + w_n f_n + b = 0$.

Evaluating a Decision Boundary
Margin or Safe Zone: the width that the boundary could be increased by before hitting a feature point.
(Figure: two candidate boundaries with Margin I and Margin II classify the same test point differently. Decision I: Face. Decision II: Non-Face.)
Choose the decision boundary with maximum margin!

Support Vector Machine (SVM) [Cortes 1995]
Classifier optimized to maximize the margin.
Support Vectors: the data samples closest to the boundary.
The decision boundary and the margin depend only on the support vectors.

Support Vector Machine (SVM)
Given:
$k$ training images $I_1, I_2, \ldots, I_k$ and their Haar features $\mathbf{f}_1, \mathbf{f}_2, \ldots, \mathbf{f}_k$.
$k$ corresponding labels $\{\lambda_1, \lambda_2, \ldots, \lambda_k\}$, where $\lambda_j = +1$ if $I_j$ is a face and $\lambda_j = -1$ if $I_j$ is not a face.
Find: the decision boundary $\mathbf{w}^T \mathbf{f} + b = 0$ with maximum margin $\rho$. [Cortes 1995]

Finding Decision Boundary $(\mathbf{w}, b)$ [Cortes 1995]
For each training sample $(\mathbf{f}_i, \lambda_i)$:
If $\lambda_i = +1$: $\mathbf{w}^T \mathbf{f}_i + b \geq \rho/2$
If $\lambda_i = -1$: $\mathbf{w}^T \mathbf{f}_i + b \leq -\rho/2$
Both cases combine into $\lambda_i (\mathbf{w}^T \mathbf{f}_i + b) \geq \rho/2$.
(Figure: the boundary $\mathbf{w}^T \mathbf{f} + b = 0$ lies midway between the margin lines $\mathbf{w}^T \mathbf{f} + b = \rho/2$ and $\mathbf{w}^T \mathbf{f} + b = -\rho/2$.)
If $\mathcal{S}$ is the set of support vectors, then for every support vector $s \in \mathcal{S}$:
$\lambda_s (\mathbf{w}^T \mathbf{f}_s + b) = \rho/2$
Numerical methods exist to find $\mathbf{w}$, $b$ and $\mathcal{S}$ that maximize $\rho$.
MATLAB: svmtrain (recent MATLAB releases provide fitcsvm in its place).

Classification using SVM
Given: Haar features $\mathbf{f}$ for an image window and SVM parameters $\mathbf{w}, b, \rho, \mathcal{S}$.
Classification: compute $d = \mathbf{w}^T \mathbf{f} + b$.
If $d \geq \rho/2$: Face
If $d > 0$ and $d < \rho/2$: Probably Face
If $d < 0$ and $d > -\rho/2$: Probably Not-Face
If $d \leq -\rho/2$: Not-Face

Face Detection Results [I.4]

Remarks
- Successful vision technology used in cameras, surveillance, biometrics, search.
- Performance continues to improve.

References and Credits

References: Papers
[Burges 1998] C. J. C. Burges. "A Tutorial on Support Vector Machines for Pattern Recognition". 1998.
[Cortes 1995] C. Cortes and V. N. Vapnik. "Support-Vector Networks". Machine Learning 1995.
[Crow 1985] F. Crow. "Summed-area tables for texture mapping". SIGGRAPH 1984.
[Freund 1996] Y. Freund and R. E. Schapire. "Experiments with a new boosting algorithm". 1996.
[Viola 2004] P. Viola and M. Jones. "Robust real-time face detection". 2004.

Image Credits
I.1 http://dailymail.co.uk/sciencetech/article-1339112/Facebook-facial-recognition-. software-suggest-friends-tagging-new-photos.html Associated Newspapers Ltd.
I.2 http://www.designboom.com/design/acure-digital-vending-machine/
I.3 https://www.youtube.com/watch?v=meRSKCS0d-A Herta Security.
I.4 M. Hruby. http://mhr3.blogspot.com/2012/03/face-detection-with-opencl.html Used with permission.
I.5 Purchased from iStock by Getty Images.
I.6 Purchased from iStock by Getty Images.
I.7 Purchased from iStock by Getty Images.
I.8 Purchased from iStock by Getty Images.
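As a supplement to the "Classification using SVM" slide, the four-way decision rule can be sketched as below. This is only an illustration under stated assumptions: the parameter values `W`, `B`, and `RHO` are hypothetical rather than trained, the function names are invented here, and training itself (finding the boundary and the margin) is not shown. The lecture does not assign a label to the case where d is exactly 0, so the sketch reports it separately.

```python
def svm_score(w, f, b):
    """d = w^T f + b for weight vector w, feature vector f, and offset b."""
    return sum(wi * fi for wi, fi in zip(w, f)) + b


def classify(d, rho):
    """Map the score d to the four outcomes listed in the lecture."""
    if d >= rho / 2:
        return "Face"
    if d > 0:
        return "Probably Face"
    if d <= -rho / 2:
        return "Not-Face"
    if d < 0:
        return "Probably Not-Face"
    return "On boundary"  # d == 0: not assigned a label in the lecture


# Hypothetical parameters, chosen only to exercise each branch.
W = (0.6, 0.8)
B = -1.0
RHO = 2.0

for features in [(3.0, 2.0), (1.0, 1.0), (0.5, 0.5), (-1.0, -1.0)]:
    d = svm_score(W, features, B)
    print(features, classify(d, RHO))
```

Windows whose score falls inside the margin get the softer "Probably" labels, which leaves room for a later stage or a stricter threshold to settle them.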
