
COMP9517_24T2W9_Motion_Estimation.pdf


Full Transcript


COMP9517: Computer Vision
Motion Estimation
Copyright (C) UNSW COMP9517 24T2W9

Introduction
Adding the time dimension to image formation: a changing scene may be observed and analysed via a sequence of images. Changes in an image sequence provide features for:
– Detecting objects that are moving
– Computing trajectories of moving objects
– Performing motion analysis of moving objects
– Recognising objects based on their behaviours
– Computing the motion of the viewer in the world
– Detecting and recognising activities in a scene

Applications
Motion-based recognition
– Human identification based on gait, automatic object detection
Automated surveillance
– Monitoring a scene to detect suspicious activities or unlikely events
Video indexing
– Automatic annotation and retrieval of videos in multimedia databases
Human-computer interaction
– Gesture recognition, eye gaze tracking for data input to computers
Traffic monitoring
– Real-time gathering of traffic statistics to direct traffic flow
Vehicle navigation
– Video-based path planning and obstacle avoidance capabilities

Scenarios
Still camera: constant background with
– A single moving object
– Multiple moving objects
Moving camera: relatively constant scene with
– Coherent scene motion
– A single moving object
– Multiple moving objects

Topics
Change detection: using image subtraction to detect changes in scenes
Sparse motion estimation: using template matching to estimate local displacements
Dense motion estimation: using optical flow to compute a dense motion vector field

Change Detection
Change Detection
Detecting an object moving across a constant background: the forward and rear edges of the object advance only a few pixels per frame. By subtracting the image It from the previous image It−1, the edges should be evident as the only pixels significantly different from zero.

Image Subtraction
Step: derive a background image from a set of video frames at the beginning of the video sequence (example data: Performance Evaluation of Tracking and Surveillance (PETS) 2009 Benchmark).
Step: subtract the background image from each subsequent frame to create a difference image.
Step: threshold and enhance the difference image to fuse neighbouring regions and remove noise.
The detected bounding boxes can then be overlaid on the input frame.

Change Detection
Image subtraction algorithm
– Input: images It and It−Δt (or a model image)
– Input: an intensity threshold τ
– Output: a binary image Iout
– Output: a set of bounding boxes B
1. For all pixels [r, c] in the input images, set Iout[r, c] = 1 if |It[r, c] − It−Δt[r, c]| > τ, and Iout[r, c] = 0 otherwise.
2. Perform connected components extraction on Iout.
3. Remove small regions in Iout, assuming they are noise.
4. Perform a closing of Iout using a small disk to fuse neighbouring regions.
5. Compute the bounding boxes of all remaining regions of changed pixels.
6.
Return Iout[r, c] and the bounding boxes B of the regions of changed pixels.

Sparse Motion Estimation

Motion Vector
A motion field is a 2D array of 2D vectors representing the motion of 3D scene points. A motion vector in the image represents the displacement of the image of a moving 3D point:
– Tail at time t and head at time t+Δt
– Instantaneous velocity estimate at time t
[Figure: characteristic motion fields for zoom out, zoom in, and pan left.]

Sparse Motion Estimation
A sparse motion field can be computed by identifying pairs of points that correspond in two images taken at times t and t+Δt.
Assumption: intensities of interesting points and their neighbours remain nearly constant over time.
Two steps:
– Detect interesting points at time t
– Search for the corresponding points at time t+Δt

Detect interesting points using image filters:
– Canny edge detector
– Hessian ridge detector
– Harris corner detector
– Scale-invariant feature transform (SIFT)
– Fully convolutional neural network (FCN)
Or using an interest operator, which computes the intensity variance in the vertical, horizontal, and diagonal directions; a pixel is an interest point if the minimum of these four variances exceeds a threshold.

Detect Interesting Points

procedure detect_interesting_points(I, V, w, t) {
  for (r = 0 to MaxRow - 1)
    for (c = 0 to MaxCol - 1)
      if (I[r,c] is a border pixel) continue;
      else if (interest_operator(I, r, c, w) >= t)
        add (r, c) to set V;
}

procedure interest_operator(I, r, c, w) {
  v1 = variance of intensity of horizontal pixels I[r,c-w] ... I[r,c+w];
  v2 = variance of intensity of vertical pixels I[r-w,c] ... I[r+w,c];
  v3 = variance of intensity of diagonal pixels I[r-w,c-w] ... I[r+w,c+w];
  v4 = variance of intensity of diagonal pixels I[r-w,c+w] ... I[r+w,c-w];
  return min(v1, v2, v3, v4);
}
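A minimal Python rendering of this interest operator, as an illustrative sketch only (the image is assumed to be a grayscale list of lists; function names follow the pseudocode):

```python
# Interest operator: intensity variance along four directions through a pixel;
# the pixel is "interesting" if the smallest of the four variances exceeds t.

def variance(vals):
    m = sum(vals) / len(vals)
    return sum((v - m) ** 2 for v in vals) / len(vals)

def interest_operator(I, r, c, w):
    horiz = [I[r][c + k] for k in range(-w, w + 1)]
    vert  = [I[r + k][c] for k in range(-w, w + 1)]
    diag1 = [I[r + k][c + k] for k in range(-w, w + 1)]
    diag2 = [I[r + k][c - k] for k in range(-w, w + 1)]
    return min(variance(horiz), variance(vert),
               variance(diag1), variance(diag2))

def detect_interesting_points(I, w, t):
    rows, cols = len(I), len(I[0])
    V = []
    for r in range(w, rows - w):        # skip border pixels, where the
        for c in range(w, cols - w):    # window would fall off the image
            if interest_operator(I, r, c, w) >= t:
                V.append((r, c))
    return V
```

On a synthetic image containing a bright square, only the corner pixel of the square has significant variance in all four directions; edge pixels fail because one direction is uniform.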
Sparse Motion Estimation
Search for corresponding points: given an interesting point Pi in It, take its neighbourhood in It and find the best matching neighbourhood within a search region in It+Δt, under the assumption that the amount of movement is limited. The best match Qi defines the motion vector. This approach is also known as template matching.

Similarity Measures
Cross-correlation (to be maximised):
CC(Δx, Δy) = Σ_{(x,y)∈T} It(x, y) · It+Δt(x+Δx, y+Δy)
Sum of absolute differences (to be minimised):
SAD(Δx, Δy) = Σ_{(x,y)∈T} |It(x, y) − It+Δt(x+Δx, y+Δy)|
Sum of squared differences (to be minimised):
SSD(Δx, Δy) = Σ_{(x,y)∈T} [It(x, y) − It+Δt(x+Δx, y+Δy)]²
Mutual information (to be maximised):
MI(A, B) = Σ_a Σ_b PAB(a, b) log2[ PAB(a, b) / (PA(a) PB(b)) ]
where the subimages to compare are A ⊂ It and B ⊂ It+Δt, PA(a) and PB(b) are the intensity probabilities, and PAB(a, b) is the joint intensity probability.
http://dx.doi.org/10.1016/j.jbi.2011.04.008

Dense Motion Estimation
Assumptions:
– The object reflectivity and illumination do not change during the considered time interval
– The distance of the object to the camera and to the light sources does not vary significantly over this interval
– Each small neighbourhood Nt(x, y) at time t is observed in some shifted position Nt+Δt(x+Δx, y+Δy) at time t+Δt
These assumptions may not hold exactly in reality, but they enable useful computations and approximations.

Spatiotemporal Gradient
Taylor series expansion of a function:
f(x + Δx) = f(x) + (∂f/∂x) Δx + h.o.t. ⇒ f(x + Δx) ≈ f(x) + (∂f/∂x) Δx
Multivariable
Taylor series approximation:
f(x+Δx, y+Δy, t+Δt) ≈ f(x, y, t) + (∂f/∂x) Δx + (∂f/∂y) Δy + (∂f/∂t) Δt   (1)

Optical Flow Equation
Assuming a neighbourhood Nt(x, y) at time t moves over vector V = (Δx, Δy) to an identical neighbourhood Nt+Δt(x+Δx, y+Δy) at time t+Δt leads to the optical flow equation:
f(x+Δx, y+Δy, t+Δt) = f(x, y, t)   (2)

Optical Flow Computation
Combining (1) and (2) yields the following constraint:
(∂f/∂x) Δx + (∂f/∂y) Δy + (∂f/∂t) Δt = 0 ⇒
(∂f/∂x)(Δx/Δt) + (∂f/∂y)(Δy/Δt) + (∂f/∂t)(Δt/Δt) = 0 ⇒
(∂f/∂x) vx + (∂f/∂y) vy + ∂f/∂t = 0 ⇒
∇f · v = −ft
where v = (vx, vy) is the velocity or optical flow of f(x, y, t) and ∇f = (fx, fy) = (∂f/∂x, ∂f/∂y) is the gradient.

The optical flow equation provides a constraint that can be applied at every pixel position. However, the equation does not have a unique solution, so further constraints are required. For example, by using the optical flow equation for a group of adjacent pixels and assuming that all of them have the same velocity, the optical flow computation amounts to solving a linear system of equations using the least-squares method. Many other solutions have been proposed (see the references).

Example: the Lucas-Kanade approach to optical flow. Assume the optical flow equation holds for all pixels pi in a certain neighbourhood, and use the notation v = (vx, vy), fx = ∂f/∂x, fy = ∂f/∂y, ft = ∂f/∂t. Then we have the following set of equations:
fx(p1) vx + fy(p1) vy = −ft(p1)
fx(p2) vx + fy(p2) vy = −ft(p2)
...
fx(pN) vx + fy(pN) vy = −ft(pN)
The set of
equations can be rewritten in matrix form as Av = b, where A is the N×2 matrix with rows [fx(pi) fy(pi)], v = (vx, vy), and b is the N-vector with elements −ft(pi). This overdetermined system can be solved using the least-squares approach:
A^T A v = A^T b ⇒ v = (A^T A)^(−1) A^T b

Optical Flow Example
https://www.youtube.com/watch?v=GIUDAZLfYhY

References and Acknowledgements
– Chapter 8 of Szeliski 2010
– Chapter 9 of Shapiro and Stockman 2001
Some images are drawn from the above references.

Example exam question
Which one of the following statements about motion analysis is incorrect?
A. Detection of moving objects by subtraction of successive images in a video works best if the background is constant.
B. Sparse motion estimation in a video can be done by template matching and minimising the mutual information measure.
C. Dense motion estimation using optical flow assumes that each small neighbourhood remains constant over time.
D. Optical flow provides an equation for each pixel but requires further constraints to solve the equation uniquely.
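To make the Lucas-Kanade least-squares step concrete, here is a minimal Python sketch (illustrative only, not the lecture's reference code) that solves the 2×2 normal equations A^T A v = A^T b for one neighbourhood, given hypothetical derivative samples fx, fy, ft at its pixels:

```python
# Lucas-Kanade per-neighbourhood solve: the normal equations A^T A v = A^T b
# reduce to a 2x2 linear system in (vx, vy), solved here by Cramer's rule.

def lucas_kanade(fx, fy, ft):
    # Entries of A^T A (symmetric 2x2) and of A^T (-ft)
    sxx = sum(x * x for x in fx)
    sxy = sum(x * y for x, y in zip(fx, fy))
    syy = sum(y * y for y in fy)
    sxt = sum(x * t for x, t in zip(fx, ft))
    syt = sum(y * t for y, t in zip(fy, ft))
    det = sxx * syy - sxy * sxy
    if det == 0:
        # A^T A singular: gradients are degenerate (aperture problem)
        raise ValueError("flow not uniquely determined in this neighbourhood")
    # Since b = -ft, the right-hand side is (-sxt, -syt)
    vx = (-sxt * syy + syt * sxy) / det
    vy = (-syt * sxx + sxt * sxy) / det
    return vx, vy
```

For example, derivatives synthesised from a true flow v = (1, 2) via ft(pi) = −[fx(pi)·1 + fy(pi)·2] recover that flow exactly. In practice fx, fy come from spatial derivative filters, ft from frame differencing, and the equations are often weighted toward the neighbourhood centre.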
