Computer Vision - Stereo Ch5 PDF
Al-Balqa' Applied University (BAU)
Dr. Ashraf ALDabbas
Summary
This document provides an overview of stereo vision in computer vision. It discusses various aspects including stereo reconstruction, cues for 3D reconstruction, stereo matching, and associated challenges. The content aims to give broad introductory knowledge for learners of computer vision.
Full Transcript
Ch5: Computer Vision - Stereo
Dr. Ashraf ALDabbas

Outline
- Cues for 3D reconstruction
- Stereo cues
- Stereo reconstruction
  1) Camera calibration and rectification: an easier, mostly solved problem
  2) Stereo correspondence: a harder problem

What is Stereo in Computer Vision?
In computer vision, stereo refers to stereo vision, a technique that mimics human binocular vision to perceive depth and 3D structure from two or more images taken from slightly different viewpoints. It is inspired by the way humans use their two eyes to estimate distances to objects in the real world.

Key Components of Stereo Vision
1. Two images (stereo pair):
   - Captured by two cameras positioned slightly apart; the distance between them is the baseline.
   - Both images show the same scene from different perspectives.
2. Disparity:
   - The difference in the position of an object between the two images.
   - Nearby objects have a larger disparity; distant objects have a smaller disparity.
3. Depth estimation:
   - Depth is computed from the disparity, the camera parameters, and the geometry of the stereo setup.
   - The computation is based on triangulation.

Applications of Stereo Vision
1. 3D reconstruction: recreating a 3D model of a scene or object.
2. Autonomous vehicles: detecting obstacles, estimating distances, and understanding the 3D layout of the environment.
3. Robotics: enabling robots to perceive depth for navigation and manipulation.
4. AR/VR: creating immersive environments that require depth perception.
5. Medical imaging: reconstructing 3D models of organs or tissues.
6. Object tracking and gesture recognition: recognizing and analyzing movements in 3D space.

Common Challenges
- Occlusions: parts of the scene are visible in one image but not in the other.
- Matching ambiguity: corresponding points are hard to match between the two images, especially in areas with little texture.
- Lighting and reflection variations: lighting or reflections differ between the two viewpoints.

Algorithms and Techniques
- Stereo matching: finding corresponding points in the stereo pair. Methods include block matching, Semi-Global Matching (SGM), and learning-based approaches.
- Depth map generation: creating a dense representation of the depth of each pixel.
Stereo vision is foundational for many real-world systems and is a key step toward machine perception.

Binocular Stereo

What is this?
[Figure: single image stereogram, https://en.wikipedia.org/wiki/Autostereogram]
The previous image is an example of a Single Image Stereogram, or autostereogram, a technique used to create a 3D illusion from a 2D image.
https://giphy.com/gifs/wigglegram-706pNfSKyaDug

Stereo Vision as Localizing Points in 3D
- An object point projects to some point in our image.
- That image point corresponds to a ray in the world.
- Two rays intersect at a single point, so to localize points in 3D we need two eyes.

Why Two Eyes?
- Charles Wheatstone first explained stereopsis in 1838.
- Disparity d is the difference in x coordinates of corresponding points.
[Figure: a 3D scene seen by the left eye and the right eye; a point at (x,y) in the left image appears at (x-d,y) in the right image]

3D Shape from Stereo
Use two cameras instead of two eyes.
[Figure: the same setup with a left camera and a right camera; (x,y) in the left image, (x-d,y) in the right image]

Stereo
Given two images from different viewpoints:
- How can we compute the depth of each point in the image?
- Based on how much each pixel moves between the two images.

Epipolar geometry
[Figure: epipolar lines through (x1, y1) and (x2, y1)]
Two images captured by a purely horizontally translating camera (rectified stereo pair).
x2 - x1 = the disparity of pixel (x1, y1)

Disparity = inverse depth
http://stereo.nypl.org/view/41729
(Or, hold a finger in front of your face and wink each eye in succession.)
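The "disparity = inverse depth" relation can be illustrated numerically. The sketch below assumes a rectified pair and the relation Z = B f / d that this chapter derives later under "Formula: Depth from Disparity". The function name, the 700-pixel focal length, and the 0.12 m baseline are hypothetical values chosen for illustration only.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth Z = f * B / d for a rectified stereo pair.

    disparity_px: disparity d in pixels (x_left - x_right), must be > 0
    focal_px:     focal length f expressed in pixels
    baseline_m:   distance B between the optical centers, in meters
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_px * baseline_m / disparity_px


# Hypothetical rig, not taken from the slides.
FOCAL_PX = 700.0
BASELINE_M = 0.12

for d in (84.0, 42.0, 21.0):
    z = depth_from_disparity(d, FOCAL_PX, BASELINE_M)
    print(f"disparity {d:5.1f} px -> depth {z:.2f} m")
```

Each halving of the disparity doubles the computed depth, which is why distant objects are harder to range accurately: a one-pixel matching error matters more when the disparity is small.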
Your basic stereo matching algorithm
Match pixels in conjugate epipolar lines:
- Assume brightness constancy
- This is a challenging problem
- Hundreds of approaches; a good survey and evaluation: http://www.middlebury.edu/stereo/

For each epipolar line:
  For each pixel in the left image:
    compare with every pixel on the same epipolar line in the right image
    pick the pixel with minimum match cost
Improvement: match windows

Stereo matching based on SSD (Sum of Squared Differences)
[Figure: SSD plotted against disparity d; the minimum, at d_min, is the best matching disparity]

Effect of window size
[Figure: results for window sizes W = 3 and W = 20]
- Smaller window: + more detail, - more noise
- Larger window: + less noise, - less detail
Better results with an adaptive window:
- T. Kanade and M. Okutomi, A Stereo Matching Algorithm with an Adaptive Window: Theory and Experiment, ICRA 1991.
- D. Scharstein and R. Szeliski, Stereo matching with nonlinear diffusion, IJCV, July 1998.

Stereo results
- Data from University of Tsukuba
- Similar results on other images without ground truth
[Figure: scene and ground truth]

Stereo as energy minimization
What defines a good stereo correspondence?
1. Match quality: each pixel should find a good match in the other image.
2. Smoothness: if two pixels are adjacent, they should (usually) move about the same amount.

Find the disparity map d that minimizes an energy function. For simple pixel / window matching, the energy is the sum of the per-pixel matching costs:
$$E(d) = \sum_{(x,y)} C(x, y, d(x,y))$$
where $C(x, y, d(x,y))$ is the SSD distance between windows I(x, y) and J(x + d(x,y), y).

The previous slide is about stereo vision, where the goal is to compute the disparity map d(x,y), which gives the difference in position of corresponding points in the two stereo images. This disparity map is critical for reconstructing 3D information from 2D images.
Explanation:
1. Objective: find a disparity map d(x,y) that minimizes an energy function E(d). The energy is the cost associated with a particular disparity configuration.
2. Energy function: for window matching, E(d) is the sum over all pixels of the matching cost C(x, y, d(x,y)).

[Figure: C(x, y, d), the disparity space image (DSI); the slice shown is at y = 141, with axes x and d]

Stereo System
Unlike eyes, stereo cameras are usually not on the same plane; this gives better numerical stability.
[Figure: a 3D scene point seen by the left and right cameras through their optical centers]

Stereo System: Triangulation
[Figure: 3D scene point P, the two optical centers, and the left and right cameras]
Depth by triangulation:
- Given two corresponding points in the left and right image, cast the rays through the optical centers of the cameras.
- The ray intersection is the corresponding 3D world point P.
- The depth of P is based on the camera positions and parameters.
Triangulation ideas can be traced to ancient Greece.
[Figure: document from 1533]

What is needed for Triangulation
1. Distance between cameras and camera focal length
   - Solved through camera calibration, essentially a solved problem
   - We will not talk about it
   - Code available on the web:
     OpenCV http://www.intel.com/research/mrl/research/opencv/
     Zhengyou Zhang http://research.microsoft.com/~zhang/Calib/
2. Pairs of corresponding pixels in the left and right images
   - Called the stereo correspondence problem, still much researched

Depth from disparity
[Figure: scene point X at depth z, image points x and x', focal length f, camera centers C and C' separated by the baseline]

Formula: Depth from Disparity
Top-down view of the geometry (slice through the XZ plane). From camera calibration we know the distance between the camera optical centers, called the baseline B, and the camera focal length f.
[Figure: point P = (X,Y,Z) at depth Z; left and right optical centers $C_l$ and $C_r$ separated by the baseline B; left and right image points $x_l$ and $x_r$ at distance f from the centers]
- Height-to-base ratio of triangle $C_l P C_r$: $\frac{Z}{B}$
- Height-to-base ratio of triangle $x_l P x_r$, formed by the projections of the 3D point P: $\frac{Z - f}{B - x_l + x_r}$ (here $x_l$ is positive and $x_r$ is negative)
- Triangles $C_l P C_r$ and $x_l P x_r$ are similar, so
$$\frac{Z}{B} = \frac{Z - f}{B - x_l + x_r}$$
- Rewriting:
$$Z = \frac{B f}{x_l - x_r}$$
where $x_l - x_r$ is the disparity.

Stereo Correspondence: Epipolar Lines
Which pairs of pixels correspond to the same scene element?
[Figure: left and right cameras with their optical centers]

Epipolar constraint
- Given a left image pixel, the corresponding pixel in the right image must lie on a line called the epipolar line.
- This reduces correspondence to a 1D search along conjugate epipolar lines.
- Demo: http://www.ai.sri.com/~luong/research/Meta3DViewer/EpipolarGeo.html

Stereo Rectification
- Epipolar lines can be computed from camera calibration.
- Usually they are not horizontal.
- The stereo pair can be rectified to make the epipolar lines horizontal.

Stereo Correspondence
[Figure: (x,y) in the left image corresponds to (x-d,y) in the right image]
- From now on, assume the stereo pair is rectified.
- How to solve the correspondence problem? Corresponding pixels should be similar in intensity, color, or something else.

Difficulties in Stereo Correspondence
- Image noise: corresponding pixels have similar, but not exactly the same, intensities.
  [Figure: left and right image patches with intensities 90, 90, 98]
- Regions with (almost) constant intensity.
  [Figure: candidate matches marked "?"]
Matching each pixel individually is unreliable.

Window Matching Correspondence
Use a window (patch) of pixels:
- more likely to have enough intensity variation to form a distinguishable pattern
- also more robust to noise

Window Matching: Basic Algorithm
for each epipolar line
  for each pixel p on the left line
    compare the window around p with the same window shifted to many right window locations on the corresponding epipolar line
    pick the location corresponding to the best matching window

Which Locations to Try?
[Figure: pixel (x,y) in the left image; candidates (x-maxDisp,y), …, (x-1,y), (x,y) in the right image]
- Disparity cannot be negative.
- The maximum possible disparity is limited by the camera setup; assume we know maxDisp.
- Disparity can range from 0 to maxDisp.
- Consider only (x,y), (x-1,y), …, (x-maxDisp,y) in the right image.

Window Matching Cost
How to define the best matching window?
Define a window cost:
- sum of squared differences (SSD), or
- sum of absolute differences (SAD)
- many other possibilities
Pick the window with the best (smallest) cost.

SSD Window Cost
left image                   right image
 3  5  4  4  2  4  2          3  5  4  4  2  4  2
 7  4  1  4  4  2  6          7  4  1  4  4  2  6
 2  7 46 46 46  6  7         46 46 46  3  6  6  7
 5  9 46 46 44  9  7         48 46 44  6  4  9  7
 4  7 47 47 47  2  4         47 47 47  7  4  2  4
 4  7 56 56 46  6  7         58 56 46  5  6  6  7
 3  4  4  1  4  3  2          3  4  4  1  4  3  2
The pixel being matched (the red pixel) is the center of the 3x3 left window 46 46 44 / 47 47 47 / 56 56 46 (rows 4 to 6, columns 3 to 5).

Algorithm with SSD Window Cost
- Shift corresponding to disparity 0:
  $(46-44)^2 + (46-6)^2 + (44-4)^2 + (47-47)^2 + (47-7)^2 + (47-4)^2 + (56-46)^2 + (56-5)^2 + (46-6)^2 = 10954$
- Shift corresponding to disparity 1:
  $(46-46)^2 + (46-44)^2 + (44-6)^2 + (47-47)^2 + (47-47)^2 + (47-7)^2 + (56-56)^2 + (56-46)^2 + (46-5)^2 = 4829$
- Shift corresponding to disparity 2:
  $(46-48)^2 + (46-46)^2 + (44-44)^2 + (47-47)^2 + (47-47)^2 + (47-47)^2 + (56-58)^2 + (56-56)^2 + (46-46)^2 = 8$
The costs at disparities 2, 1, and 0 are 8, 4829, and 10954. The best SSD window cost is 8 at disparity 2, so the red pixel is assigned disparity 2. Repeat this for all image pixels.

Correspondence with SSD Matching
[Figure: SSD cost plotted against disparity; there is a unique best cost location]

Compare to One Pixel "Window"
[Figure: SSD cost plotted against disparity; there is no unique best cost location]

SAD Window Cost
SSD is fragile to outliers.
left window    candidate 1    candidate 2
 1  1 10        1  1 10       31 31 31
 1  1 10        1  1 10       31 31 31
 1  1 19        1  1 99       31 31 29
- SSD cost of candidate 1 = $80^2$ = 6400; SSD cost of candidate 2 = 6382 (best)
- SAD cost of candidate 1 = 80 (best); SAD cost of candidate 2 = 232
Candidate 1 differs from the left window in a single outlier pixel (99 instead of 19), but squaring that one difference makes SSD prefer candidate 2, which differs everywhere. SAD (Sum of Absolute Differences) is more robust.

Definition of SAD: for two windows $W_1$ and $W_2$ of the same size,
$$\text{SAD}(W_1, W_2) = \sum_{(i,j)} |W_1(i,j) - W_2(i,j)|$$
[Figures: Example 1 and Example 2, comparing windows]

Window Matching Efficiency
Suppose the image has n pixels and the matching window is 11 by 11.
- Need 11 × 11 = 121 additions and multiplications to compute one window cost
- Multiply that by the number of locations to check (maxDisp+1)
- Multiply that by n image pixels
- Total: 121 n (maxDisp+1). Too slow, and it gets worse for larger windows
- Can get the cost down to n (maxDisp+1) with integral images

Speedups: Integral Image
Given image f(x,y), the integral image I(x,y) is the sum of the values of f to the left of and above (x,y), including (x,y).
f(x,y)              I(x,y)
 0  0  0  5  5       0  0  0  5 10
 0  0  5  5  5       0  0  5 15 25
 0  5  5  5 10       0  5 15 30 50
 5  5  5 10  0       5 15 30 55 75
 5  5 10  0  0      10 25 50 75 95
Examples (x is the column and y is the row, both counted from 0):
- I(2,2) = 0 + 0 + 0 + 0 + 0 + 5 + 0 + 5 + 5 = 15
- I(4,1) = 0 + 0 + 0 + 5 + 5 + 0 + 0 + 5 + 5 + 5 = 25

Efficiently Computing Integral Image
Suppose the integral image has been computed up to location (x,y). The formula for I(x,y) is built up one term at a time, starting with f(x,y) itself.
Adding the sum to the left, I(x-1,y), and the sum above, I(x,y-1), counts the region above and to the left twice, so I(x-1,y-1) is subtracted once:
$$I(x,y) = f(x,y) + I(x-1,y) + I(x,y-1) - I(x-1,y-1)$$

Integral Image: Order of Computation
Convenient order of computation:
1. first row
2. first column
3. the rest in row-wise fashion
 1  2  3  4  5
 6 10 11 12 13
 7 14 15 16 17
 8 18 19 20 21
 9 22 23 24 25

Using Integral Image
Once the integral image is computed, the sum over any rectangular window is computed with four operations. For a window with top left corner (x1,y1) and bottom right corner (x2,y2):
$$\text{sum} = I(x_2,y_2) - I(x_1-1,y_2) - I(x_2,y_1-1) + I(x_1-1,y_1-1)$$
f(x,y)              I(x,y)
 0  0  0  5  5       0  0  0  5 10
 0  0  5  5  5       0  0  5 15 25
 0  5  5  5 10       0  5 15 30 50
 5  5  5 10  0       5 15 30 55 75
 5  5 10  0  0      10 25 50 75 95
Example, for the window with corners (2,2) and (4,3): 5 + 5 + 10 + 5 + 10 + 0 = 75 - 15 - 25 + 0 = 35

Integral Image for Window Matching
Assume the SAD (sum of absolute differences) cost. We need the SAD for every pixel and every disparity in a window. The left and right images are the same as in the SSD window cost example. For the red pixel, the 3x3 SAD cost is 266 at disparity 0, 131 at disparity 1, and 4 at disparity 2.

Old inefficient algorithm:
for each pixel p
  for every disparity d
    compute the cost between the window around p in the left image and the same window shifted by d in the right image
  pick d corresponding to the best matching window

New efficient algorithm (swap the two loops and use the integral image):
for each disparity d
  for every pixel p
    compute the cost between the window around p in the left image and the same window shifted by d in the right image
pick d corresponding to the best matching window

For each disparity d, the window cost is eventually needed for all pixels. For example, suppose the current disparity is d = 1:
- Overlay the left and right images at disparity 1.
- Compute the AD (absolute difference) between every overlaid pair of pixels.
- Pad the AD image with zeros (the first column, where the images do not overlap).
- Compute the SAD in a window for every pixel. Each SAD is a window sum in the AD image, so apply the integral image to the AD image.

AD image for disparity 1 (zero-padded):
 0  2  1  0  2  2  2
 0  3  3  3  0  2  4
 0 39  0  0 43  0  1
 0 39  0  2 38  5  2
 0 40  0  0 40  2  2
 0 51  0 10 41  0  1
 0  1  0  3  3  1  1

Efficient Algorithm for Window Matching
for every pixel p do
    bestDisparity[p] = 0
    bestWindCost[p] = HUGE
for disparity d = 0, 1, …, maxD do
    overlay images at disparity d
    compute AD image for disparity d
    compute integral image from AD image
    for every pixel p do
        currentCost = window cost at pixel p, computed from integral image
        if currentCost < bestWindCost[p]
            bestWindCost[p] = currentCost
            bestDisparity[p] = d
return bestDisparity

Effect of Window Size
[Figure: left image, right image, true disparities (bright means larger disparity), and results for 3x3, 7x7, and 15x15 windows]

Effect of Window Size: Low Texture Area
[Figure: window cost plotted against disparity (0 to 15) for 3x3, 7x7, and 15x15 windows at a pixel of the left image]
- Windows of size 3x3 and 7x7 are too small to contain a distinct pattern: there is no clearly best disparity.
- The 15x15 window is large enough to contain a distinct pattern: 7 is clearly the best disparity.
- The window has to be large enough.

Effect of Window Size: Near Discontinuities
[Figure: window cost plotted against disparity (0 to 15) for 3x3, 7x7, and 15x15 windows at a pixel of the left image]
- The central pixel (the one we are matching) is on the lamp.
- Windows of size 3x3 and 7x7 contain mostly the lamp.
- The 15x15 window contains mostly the wall, so we match the wall instead of the lamp!
- The window must be small enough to contain mostly the same object as the central pixel.

Stereo reconstruction pipeline
Steps:
- Calibrate cameras
- Rectify images
- Compute disparity
- Estimate depth
What will cause errors?
- Camera calibration errors
- Poor image resolution
- Occlusions
- Violations of brightness constancy (specular reflections)
- Large motions
- Low-contrast image regions

Questions?
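The "Efficient Algorithm for Window Matching" from this chapter can be sketched in plain Python as follows. This is a minimal illustration, not the lecturer's implementation: the function names are hypothetical, images are lists of rows indexed as img[y][x] from 0, and pixels whose window (or its shifted copy) would leave the image are simply skipped, which is a border policy assumed here rather than taken from the slides. The test data are the 7x7 left and right images from the SSD example.

```python
def integral_image(img):
    """I[y][x] = sum of img over rows 0..y and columns 0..x (inclusive)."""
    h, w = len(img), len(img[0])
    integ = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            left = integ[y][x - 1] if x > 0 else 0
            above = integ[y - 1][x] if y > 0 else 0
            corner = integ[y - 1][x - 1] if x > 0 and y > 0 else 0
            integ[y][x] = img[y][x] + left + above - corner
    return integ


def window_sum(integ, x1, y1, x2, y2):
    """Sum over columns x1..x2 and rows y1..y2 using four lookups."""
    total = integ[y2][x2]
    if x1 > 0:
        total -= integ[y2][x1 - 1]
    if y1 > 0:
        total -= integ[y1 - 1][x2]
    if x1 > 0 and y1 > 0:
        total += integ[y1 - 1][x1 - 1]
    return total


def sad_disparity_map(left, right, max_disp, radius):
    """Best disparity per pixel under a SAD cost on (2*radius+1)^2 windows."""
    h, w = len(left), len(left[0])
    best_disp = [[0] * w for _ in range(h)]
    best_cost = [[float("inf")] * w for _ in range(h)]
    for d in range(max_disp + 1):
        # AD image: |left(x, y) - right(x - d, y)|, zero-padded where x < d.
        ad = [[abs(left[y][x] - right[y][x - d]) if x >= d else 0
               for x in range(w)] for y in range(h)]
        integ = integral_image(ad)
        for y in range(radius, h - radius):
            # Start at radius + d so the shifted window stays inside the image.
            for x in range(radius + d, w - radius):
                cost = window_sum(integ, x - radius, y - radius,
                                  x + radius, y + radius)
                if cost < best_cost[y][x]:
                    best_cost[y][x] = cost
                    best_disp[y][x] = d
    return best_disp, best_cost


LEFT = [
    [3, 5, 4, 4, 2, 4, 2],
    [7, 4, 1, 4, 4, 2, 6],
    [2, 7, 46, 46, 46, 6, 7],
    [5, 9, 46, 46, 44, 9, 7],
    [4, 7, 47, 47, 47, 2, 4],
    [4, 7, 56, 56, 46, 6, 7],
    [3, 4, 4, 1, 4, 3, 2],
]
RIGHT = [
    [3, 5, 4, 4, 2, 4, 2],
    [7, 4, 1, 4, 4, 2, 6],
    [46, 46, 46, 3, 6, 6, 7],
    [48, 46, 44, 6, 4, 9, 7],
    [47, 47, 47, 7, 4, 2, 4],
    [58, 56, 46, 5, 6, 6, 7],
    [3, 4, 4, 1, 4, 3, 2],
]

disp, cost = sad_disparity_map(LEFT, RIGHT, max_disp=2, radius=1)
# The red pixel of the slides is row 4, column 3 when counting from 0.
print(disp[4][3], cost[4][3])  # prints: 2 4
```

Because the disparity loop is outermost, one AD image and one integral image are built per disparity, and each window cost then takes four lookups regardless of the window size.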