Scale-Space Theory & Image Feature Detection PDF
Document Details
Uploaded by BuoyantGenre
University of Twente
2021
Ferdi van der Heijden
Tags
Summary
This document covers scale-space theory, image feature detection and visualization. It introduces the concept of scale in image analysis and provides information on calculating image derivatives, edge detection, line detection, and multi-scale analysis using low-pass filtering with Gaussian functions.
Full Transcript
Scale-space Theory & Image Feature Detection Segmentation & Visualization Ferdi van der Heijden University of Twente - RAM 12/28/2021 Contents 1 INTRODUCTION..................................................................................................................
Scale-space Theory & Image Feature Detection Segmentation & Visualization Ferdi van der Heijden University of Twente - RAM 12/28/2021 Contents 1 INTRODUCTION........................................................................................................................................... 1 2 SCALE-SPACE............................................................................................................................................ 1 2.1 THE SCALE-SPACE STACK.................................................................................................................... 3 2.2 CALCULATION OF THE SCALE-SPACE STACK............................................................................................ 4 2.3 IMPLEMENTATION DETAILS.................................................................................................................. 5 3 IMAGE DERIVATIVES.................................................................................................................................... 5 3.1 CALCULATION OF IMAGE DERIVATIVES................................................................................................... 6 3.1.1 FIRST ORDER PARTIAL DERIVATIVES.................................................................................................. 7 3.1.2 HIGHER ORDER PARTIAL DERIVATIVES............................................................................................... 8 3.1.3 IMPLEMENTATION DETAILS.............................................................................................................. 8 3.2 DIRECTIONAL DERIVATIVES.................................................................................................................. 9 3.3 ROTATION INVARIANTS........................................................................................................................ 9 3.3.1 GRADIENT MAGNITUDE................................................................................................................ 10 3.3.2 THE LAPLACIAN.......................................................................................................................... 12 3.3.3 IMPLEMENTATION DETAILS........................................................................................................... 12 4 EDGE DETECTION.................................................................................................................................... 13 4.1 1D EDGE DETECTION...................................................................................................................... 13 4.2 GENERALIZATION FROM 1D TO 2D EDGE DETECTION........................................................................... 14 4.2.1 MARR-HILDRETH OPERATION....................................................................................................... 14 4.2.2 CANNY EDGE DETECTOR.............................................................................................................. 15 4.2.3 INFLUENCE OF THE SCALE............................................................................................................ 16 4.2.4 IMPLEMENTATION DETAILS........................................................................................................... 16 5 LINE DETECTION...................................................................................................................................... 18 5.1 TYPIFICATION OF A LINE WITH THE HESSIAN MATRIX............................................................................. 18 5.2 SECOND ORDER GAUGE COORDINATES............................................................................................... 19 5.3 DETECTION AND LOCALIZATION OF LINE ELEMENTS.............................................................................. 20 5.4 IMPLEMENTATION DETAILS............................................................................................................... 21 1 Scale-space Theory & Image Feature Detection Segmentation & Visualization 1 Introduction Figure 1. Scale-space theory. In geography, the scale of a map is the ratio between a What you see in an image depends on the scale at which distance on the map and the corresponding distance on the you are looking. ground. Scale-space theory is a subfield in image science. The present syllabus is an introduction to these topics. Here, a different definition of the concept of scale is used. This illustrated in Figure 1. Gaussian low-pass filtering has 2 Scale-space been applied to an input image. The resulting image is The input image in Figure 1 was manipulated to enhance again low-pass filtered resulting in a second output image. the effect of scaling. Normal portrait photographs do not It can be seen that the application of these low-pass filters change so much when visualized at different scales. seems to change the visual information that the input Nonetheless, the premise about the scale is correct. Figure image carries. In scale-space theory, they say that the scale 2 shows four non-manipulated photographs of a quilt. The has been changed in these two steps. pictures were taken at different distances between the Figure 1 showcases the fundamental premise in scale- camera and the centre of the quilt. The settings of the space theory: camera were not changed. In picture 2a the individual What you see in an image depends on woollen yarns are observed. In picture 2d, rectangular the scale at which you are looking. shapes are seen. In the figure, the view on the portrait at the original scale Due to the perspective projection and the smaller object differs much from the view at a larger scale. distance, picture 2a shows more details of the quilt than picture 2d. Clearly, the scales of the pictures differ, but the Scale-space theory contributes immensely to the field of question is how to quantify this? The definition of scale, as image analysis. It provides a solid fundament for image used in geographical maps, is not applicable. This is feature extraction: because the quilt is a curved 3D surface. Therefore, one image derivatives, cannot state that, for instance, 1 pixel in the image edge and line detection, corresponds to x mm on the quilt. keypoint detection, higher order features, such as isophote curvature. A different definition of scale is needed. This definition is In addition, scale-space theory unlocks multi-scale analysis: provided in scale-space theory. The theory is highly inspired feature extraction at several scales. on biological visual systems. Seminal and ground-breaking Figure 2. Photographs of a crocheted quilt taken at distances of 50 cm, 75 cm, 125 cm, and 200 cm. 2 2. Starting from an original image, we can artificially increase its scale by processing it. By doing this, the image resolution increases and the image contains less finer structures. Thus, the original image can be transformed into a series of new images with increasing scales. 3. The operation to fulfil this transformation is low-pass filtering based on convolution. 4. The PSF that is applied in the convolution to increase the scale is the Gaussian function. This function is most consistent in different aspects. The size (spatial extent) of the Gaussian, is parameterized by its width σ. 5. The scale is defined by the effective area of this PSF. Thus: scale = σ 2. 6. If sizes of structures of objects in the image are not known priorly, the image should be looked at not only at the smallest scale, but through a range of scales with varying sizes. In other words, the analysis should take place on the set of images with increasing scales: multi-scale analysis. Figure 3. Scaling an image by increasing the aperture The correctness of statement 3 is illustrated in Figure 3. through which the image is looked at. The figure shows two images of the object ‘quilt’. The first a) Original image taken at a distance Z=50 cm, looked image is acquired at an object distance of Z1 = 50 cm , the through the same aperture size as image b). second at a distance of Z 2 = 200 cm. The objective is to b) Original image taken at a distance Z=200 cm. bring the first image at the same scale as the one in the work has been provided by David Marr (1945-1980) and second image. Just resampling the first image in the ratio Jan Koenderink (1943-). of the object distances of the two images, i.e., every other Z 2 Z1 = 4 pixel, is not satisfying. It results in a pixelated Image resolution is the minimal distance between image image and it has no good resemblance with the second structures that can be resolved. It is tempting to think that image. When first convolved with a Gaussian PSF no this distance equals the size of a pixel, but this is not pixilation shows up, and the resemblance is much better. generally true. Image formation always requires an aperture function, i.e., the PSF with which physically the image is The choice of the Gaussian PSF for the aperture function is acquired. An example in a digital camera is the PSF of a motivated by a couple of arguments 1. We mention just a lens combined with the finite area of the image sensors in a few: CCD circuit This combination provides an effective aperture In biological visual systems, the PSFs, measured at the function. It is the size of this aperture function which retina, seem to approximate derivatives of Gaussian determines the image resolution. functions with different width. Gaussian functions are continuously differentiable to Scale-space theory applies an axiomatic definition for the any order. concept of scale. Leaving out all the mathematical details, Gaussian functions have no zeros; neither in the some of the presumptions, axioms and consequences in spatial domain, nor in the frequency domain. this theory are: 1. The original image has the smallest scale. By definition, Because these arguments are too mathematically involved, this scale is set to zero. the specifics will be omitted here. However, there is one 1ter Haar Romeny: Front-end Vision and Multi-Scale Image Analysis. Kluwer Academic Publishers, ISBN: 1402015038, 2003. 3 σ= ∆s. This follows immediately if the convolution operation is considered in the frequency domain. The Gaussian function and its Fourier transform are given by: x2 + y2 1 − h ( x, y , σ ) = e 2σ 2 2πσ 2 (5) −2πσ 2 ( u 2 + v 2 ) H (u , v, σ ) = e In the frequency domain, convolution becomes Figure 4. An image at three different scales. The scaling multiplication. The equivalence of eq (4) is: operation must be additive: ∆sac = ∆sab + ∆sbc. H sc (u , v, ∆s= ac ) H sc (u , v, ∆sab ) H sc (u, v, ∆sbc ) (6) strong argument which we will discuss in more detail. u , v, ∆s ) F {hsc ( x, y, ∆s )}. Substitution of (5) where H sc (= Figure 4 illustrates this argument. with σ 2 = ∆s yields: The figure shows an image at three different scales. Image e −2π∆sac (u + v2 ) = e −2π∆sab (u + v 2 ) −2π∆sbc ( u 2 + v 2 ) 2 2 e a is the original image and is associated with a scale s = 0. (7) −2π ( ∆sab +∆sbc )( u 2 + v 2 ) The other images have scales 4 and 16, respectively. =e Scaling image a up to the scale of image b involves an increment of scale of ∆sab. Similar notations are used for This equation holds true if and only if ∆sac = ∆sab + ∆sbc. The the other increments. A natural requirement is that scaling conclusion is that hsc ( x, y, ∆s ) = h( x, y, σ ) with σ = ∆s. in two steps, e.g., from a to b, and then from b to c, must give the same result as scaling directly from a to c. 2.1 The scale-space stack Moreover, the scaling parameters should be additive. The A scale-space stack is a 3D data cube. See Figure 5. Two requirement is that ∆sac = ∆sab + ∆sbc. dimensions are the 2D coordinates of the image plane, e.g., x and y. The third dimension is the scale. The Suppose that the original image is denoted by f ( x, y ) , and previous discussion justifies to use s = σ 2 as the definition that the image at a scale of s is denoted by f ( x, y, s ). The original image is at scale s = 0 , so that f ( x, y, 0) = f ( x, y ). To scale the image, a convolution with a PSF hsc ( x, y, ∆s ) is needed. ∆s is the increment of scale that this PSF accomplishes. = f ( x, y, s + ∆s ) f ( x, y, s ) * hsc ( x, y, ∆s ) (1) If we go from image a to b, and then from b to c, the operation becomes: f ( x, = y, sc ) f ( x, y, sa ) * hsc ( x, y, ∆sab ) * hsc ( x, y, ∆sbc ) (2) If the scaling is done in one step from a to c, we have: = f ( x, y, sc ) f ( x, y, sa ) * hsc ( x, y, ∆sac ) (3) Comparison of eq (2) and (3) yields: hsc ( x, y, ∆s= ac ) hsc ( x, y, ∆sab ) * hsc ( x, y, ∆sbc ) (4) Not many PSFs have the property that convolving them with themselves results in a scaled copy of themselves. The Figure 5. A scale-space stack. Gaussian PSF does satisfy this condition provided that σ 0 = 0.5 , ∆σ =0.3 , and K = 10 4 of scale. Starting with the image with zero scale, i.e., the original image, the stack is built by repeatedly applying Gaussian filtering to the original image: f ( x, y , s ) f ( x, y ) * h( x, y, σ ) with σ = s > 0 (8) In practice, the scale-axis needs to be sampled. Usually, this is done in an exponential growing sequence: k ∆s =sk s= 0e with k 0,1, 2, (9) s0 is the minimal scale. ∆s is the exponential step size. Figure 6. 2D Gaussian PSF The reason for using an exponential sequence is that the concept of scale is perceived logarithmically. That is, a h ( x, y , a 2 + b 2 ) = h ( x, y , a ) * h ( x, y , b ) scale increment from 1 to 2 is perceived as large as a scale increment from 10 to 20. This leads to the following recursive implementation: = Since σ = s , the scale-space stack can also be generated f ( x, y, σ k ) f ( x, y, σ k −1 ) * h( x, y, σ k −1 e 2 ∆σ − 1) (11) with σ as the running variable instead of s. We then have: The benefit of using the recursive approach of eq (11) = σ k σ 0e k ∆σ with=σ0 s0 and ∆σ = 2 ∆s 1 (10) above the direct implementation follows from comparing the widths in both cases. In the direction approach, k ∆σ See Figure 5. Note that at a small scale the finest σ k σ= = 0e σ k −1e ∆σ. If the exponential step is small, then structures (coloured triangles, etc) of the quilt are well e ∆σ ≈ 1 + ∆σ. Therefore, σ k ≈ σ k −1 (1 + ∆σ ). In the recursive observable. At a larger scale the larger structures are approach, convolution is done with a Gaussian with a width visible. of σ k −1 e 2 ∆σ − 1 ≈ σ k −1 2∆σ. The computational gain is the ratio 2∆σ (1 + ∆σ ). This gain is large when ∆σ is small. As pointed out above, s is the correct mathematical definition of scale. However, in daily language, σ is also Down-sampling often called the scale of the image. This is rationalized by The number of a computations in a convolution is the observation that, on a logarithmic standard, σ and s proportional to the number of pixels in the image. Images are proportional: log σ = 12 log s. For this reason, most often at a high level in the stack are heavily low-pass filtered and the third dimension in a scale-space stack is the width σ. don’t have high frequency components. These images can be down-sampled without loss of information. This can be 2.2 Calculation of the scale-space stack done, for instance, at each level at which a doubling of σ has been reached so that the image then can be down- In the continuous domain, the Gaussian PSF is a smooth sampled by a factor two. Figure 7 provides an example. function that is continuously differentiable. See Figure 6. In the discrete case, a direct implementation of the Separability convolution, as expressed in eq (6), is very computationally The Gaussian PSF s separable in x and y , expensive, especially if σ is large. There are three x2 y2 1 − 2 1 − 2 remedies: h ( x, y , σ ) = e 2σ e 2σ 2πσ 2πσ (12) A recursive implementation of building the stack. Down-sampling the image at larger scales in the stack. = h ( x, σ ) h ( y , σ ) Using the separability property of the Gaussian PSF. This separability property can be used to implement the 2D Recursive implementation convolution integral as a cascade of two 1D convolutions: At the n -th scale level in the stack, the original image is f ( x, y , σ ) = f ( x, y ) * h ( x, y , σ ) convolved with h( x, y, σ n ). Alternatively, this level can be (13) = ( f ( x, y ) * h ( x, σ ) ) * h ( y , σ ) reached by convolving the image at the (n − 1) -th level. This is based on the property of a Gaussian that The number of a computations in the convolution is now proportional to σ instead of to σ 2. 5 Figure 8. Sampled Gaussians The second discretization error is due to aliasing. In Figure 8, with σ = 0.75 , the Gaussian is so peaked that virtually only three samples are nonzero. This is hardly enough to represent a Gaussian as, without knowing that the samples represent a Gaussian, it is not possible to accurately reconstruct the Gaussian from these three samples. An exact quantification of this so-called aliasing error requires Fourier analysis. Since this is mathematically involved, the analysis will be omitted here. However, the conclusion of the analysis is that widths smaller than 0.7 must be avoided. The error is small but noticeable if σ > 0.7 and becomes negligible if σ ≥ 1. Figure 7. A pyramidal scale-space stack. σ 0 = 1 , ∆σ =0.14 , and K = 16 2.3 Implementation details The fundamental operation for creating a scale-space stack Discretization errors is Gaussian filtering. In MATLAB, this is implemented in the In the discrete case, the convolution integral becomes a function imgaussfilt(). In Python, the OpenCv package convolutional sum 2: holds the function cv2.GaussianBlur(). The following code snippet demonstrates its use. +J +I f (n, m, σ k ) ∑ h( j , σ k k ) ∑ h(i, σ ) f (n − i, m − j ) (14) sigma = 3.0 j= −J i= −I K = int(2*np.ceil(4*sigma)+1) Iout = cv2.GaussianBlur(Iin, (K,K), sigma, Figure 8 shows sampled versions of the Gauss function borderType=cv2.BORDER_REPLICATE) hi (σ ) with three different σ ’s. There are two types of discretization errors: truncation of the tail, and aliasing 3 Image derivatives error. From the early beginnings of the development of image analysis technology on, the derivatives in images have been The 1D Gaussian decays slowly to zero in his tails. To have found of interest. Figure 9 outlines the general idea. The a PSF matrix with a finite number of elements, the tails original image is a CT image in which various anatomical must be truncated. The missing part of the tail induces an entities are visible. Each entity has its own range of grey error. This tail is very close to zero if i > 4σ. Therefore, the levels. The transition from one entity is characterized by a truncation error can be ignored if the I is at least larger steplike change of grey level. This is shown in the plot of than 4σ. An appropriate choice for the number of Figure 9d which is an intensity plot along a horizontal cross- elements in the PSF matrix is 2 4σ + 1 , where 4σ section. The goal is to detect the locations of the rounds 4σ up to the nearest integer, i.e., the ceil() transitions. This is called edge detection. Often, the first function. step in edge detection is to calculate the derivatives in the x- and y-direction. 2The used notation is context-dependent. In the notation " f ( x, y ) ", it is understood that x and y are the coordinates in the continuous spatial domain. In the notation " f (n, m) ", the running variables are the row and column indices in the discrete domain. 6 Figure 9. The x-derivative f x ( x, y ) of an image f ( x, y ). a) and b) Original. c) The x-derivative. d) A cross-section, and e) its derivative. Figure 10. Higher order partial derivatives. f xy ( x, y ) is called the 2nd order cross derivative. Note that f xy ( x, y ) = f yx ( x, y ) because the order of differentiation In the continuous domain, differentiation is a well-defined does not matter. operation. For instance, the first derivative in the x-direction of an image f ( x, y ) is defined as follows: Higher order derivatives are defined in similar ways. Figure 10 provides examples. ∂ def f ( x + ∆ , y ) − f ( x, y ) = f x ( x, y ) = f ( x, y ) lim (15) ∂x ∆→ 0 ∆ 3.1 Calculation of image derivatives This is called the first partial derivative of f ( x, y ) with In digital signal processing, a simple way to approximate respect to x. The presumption here is that the other the first derivative of a 1D signal f (n) is the so-called running variable y is temporary kept constant. In Figure 9d forward difference f x (n)= f (n + 1) − f (n). This and e, this is shown for a horizontal cross-section of the corresponds well to eq (15) if we suppose that ∆ =1. A few image, in which y is kept constant. That is, y = y0. observations can be made: In the continuous case in eq (15), the increment ∆ It can be seen in Figure 9c, that indeed the steplike approaches zero. The derivative only depends on an transitions are highlighted. However, this only works for infinitesimal small neighbourhood around the point x. vertically oriented boundaries because the operator only In the forward difference, the increment is ∆ =1. works in the x-direction. The first derivative f y ( x, y ) in the The forward difference is supposed to be the derivative y-direction is also needed. of f (n) , but actually f x (n) corresponds to the derivative of f (n + 12 ). There is a shift over half the Partial derivatives of higher order can also be considered. sampling period. The second derivatives are: The backward difference is f (n) − f (n − 1). This holds ∂ the same disadvantage as the forward difference. f xx ( x, y ) = f x ( x, y ) ∂x There is a shift of minus half the sampling period. ∂ f yy ( x, y ) = f y ( x, y ) (16) To counteract the shifts, one could consider the ∂y ∂ symmetric difference 12 ( f (n + 1) − f (n − 1) ). In this f xy ( x, y ) = f y ( x, y ) case, ∆ =2. ∂x The conclusion is that differentiation is a neighbourhood operation. In the continuous, analytical case, the 7 neighbourhood is infinitesimal small, but in the discrete a) case the neighbourhood must have a finite size. When using differences, the choice of the size of this neighbourhood, that is, whether ∆ = +1 , ∆ = −1 , or ∆ =2 , seems to be quite arbitrary. The fact that the difference operation is a neighbourhood operation, which is linear and shift invariant, suggests that it can be implemented as convolution. Indeed, if we define: h forw = +1 −1 0 b) hback= 0 +1 −1 (17) h symm = +1 0 −1 the difference operation is f x = f * h forw , etc. We silently introduced the concise notation f = f (n, m) , f x = f x (n, m) , and h forw = h forw (n, m) , and so on. An important disadvantage of the 1D difference operators is their noise sensitivity. This can be seen back in Figure 9d. This sensitivity is counteracted by combining the operator with low-pass filtering. Table 1 gives an overview Figure 11. First order partial derivatives in scale-space. a) 1st order derivative hx (n, m, σ ) with σ = 3. of two different 2D PSFs that implement first order b) 1st order image derivatives in the x-direction at three differences in an image in the x- and y-direction, combined different scales. with low-pass filtering. in the orthogonal direction. 3.1.1 First order partial derivatives Sobel proposed to do this low-pass filtering with a 1× 3 The criticism of the approaches based on differences is filter hlow T = +1 +2 +1. The full operation becomes a that the design choices of the operators are quite arbitrary. cascade of two convolutions f x = f * h symm * hlow T. This is The operators discussed so far implicitly change the scale equivalent with one convolution with h x = h symm * hlowT of the image, but in an uncontrolled way. It is not exactly shown in Table 1. known to what extent the scale is changed. Prewitt used hlow T = +1 +1 +1 as low-pass filter. By In contrast, scale space theory provides the means to doing so, the operator can easily be generalized to, for control the scale in a well-defined way. As shown in the instance, a 5 × 5 operator for improved noise suppression. previous section, changing the scale of the image is done The latter option increases the neighbourhood size. with the convolution f (σ ) = f * h(σ ) where f is the original image, and h(σ ) is the Gaussian PSF with width (scale) σ. Differentiation in, for instance, the x-direction is done by: Table 1 2D difference operators ∂ ∂ ∂ Sobel (1968): f x (σ ) = = f (σ ) ( f *= h(σ ) ) f * = h(σ ) f * h x (σ ) ∂x ∂x ∂x +1 0 −1 +1 +2 +1 (18) hx = +2 0 −2 hy = 0 0 0 +1 0 −1 −1 −2 −1 Thus, differentiation is done by convolving the original image with the derivative of the Gaussian PSF. Since this Prewitt (1970): Gaussian is analytically known, it can be differentiated in +1 0 −1 +1 +1 +1 the continuous domain after which sampling follows: hx = +1 0 −1 hy = 0 0 0 +1 0 −1 −1 −1 −1 8 n − n2 1 − m2 a) hx (n, m, σ ) = − e 2σ 2 e 2σ 2 (19) σ 3 2π σ 2π Figure 11a is a 2D graph of this function. Figure 11b shows the horizontal component of the PSF for three different values of σ. The resulting images are also shown. Note that the version with σ = 0.75 is under-sampled and comes close to the Sobel operator. 3.1.2 Higher order partial derivatives b) Higher order derivatives of an image are obtained in similar way. For instance, the second order derivative in the y- direction follows from: ∂2 ∂2 ∂2 f yy (σ ) = = f (σ ) ( f=* h (σ ) ) f * = h(σ ) f * h yy (σ ) ∂y 2 ∂y 2 ∂y 2 (20) Differentiation in the continuous domain and successive sampling gives: Figure 12. Second order partial derivatives in scale-space. hyy (n, m, σ ) = (m 2 −σ 2 ) e − m2 2σ 2 1 e − n2 2σ 2 (21) a) 2nd order derivative hyy (n, m, σ ) with σ = 3. b) 2nd order image derivatives in the y-direction at three 2π σ 5 2π σ different scales. Figure 12 presents an example. fy = ut_gauss(f,sigma,0,1); % f*hy(sigma) fxx = ut_gauss(f,sigma,2,0); % f*hxx(sigma) 3.1.3 Implementation details fyy = ut_gauss(f,sigma,0,2); % f*hyy(sigma) fxy = ut_gauss(f,sigma,1,1); % f*hxy(sigma) Neither MATLAB, nor Python/OpenCv provide direct implementations for image derivatives in the scale-space. The Python code for Gaussian differentiation is more Functions are offered for low-pass filtering with a Gaussian involved PSF, but functions for filtering with the derivative of a #%% differentiation in the scale-space sigma = 1.0 # set scale Gaussian are not available. An approximate solution is to K = int(2*np.ceil(4*sigma)+1) use the difference operations. For instance, in the x- fg = cv2.GaussianBlur(f, (K,K), sigma, direction the approximation is: borderType=cv2.BORDER_REPLICATE) dx = np.array([[-0.5, 0, 0.5]]) dy = np.transpose(dx) f x (σ ) ≈ f * h(σ ) * + 12 0 − 12 fx = cv2.filter2D(fg,-1, dx, (22) f xx (σ ) ≈ f x (σ ) * + 12 0 − 12 borderType=cv2.BORDER_REPLICATE) A MATLAB code snippet which realizes this is: sigma = 1; % set scale Homework 3_1: fg = imgaussfilt(f,sigma); % f*h(sigma) In MATLAB, create and properly initialize a script H3_1.m. [fx,fy] = imgradientxy(fg,'central'); Read the image test_image.png, Cast it to double type. [fxx,fxy] = imgradientxy(fx,'central'); 1. Use the function ut_gauss() to calculate the first The homemade function ut_gauss() is a much easier to derivative imx of the test image. Use σ = 1. use MATLAB function, which directly implements 2. Extract the cross-section imx(376,376:1068) and plot convolution with the derivatives of a Gaussian. Example this together with a plot of the same cross-section from code: the original image. sigma = 1; % set scale fg = ut_gauss(f,sigma); % f*h(sigma) 3. Repeat 1 and 2 with σ = 3. fx = ut_gauss(f,sigma,1,0); % f*hx(sigma) 4. Discussion: observe remarkable things and explain. 9 Homework 3_2: Figure 13 shows examples. Note that the derivative along Create a Python file H3_2.py. Read the image an edge vanishes. test_image.png, Cast it to double type. 1. Calculate the second derivative imxx of the test image. Directional derivatives are calculated efficiently by using Use σ = 1. the partial derivatives. For the first order directional 2. Create the plot of the cross-section as in H3_1.m. derivate, this is done as follows: 3. Repeat 1 with σ = 3. Add the cross-section to the plot. d 4. Discussion: observe remarkable things and explain f n′( x0 , y0 ) = f ( x0 + t cos α , y0 + t sin α ) dt ∂ ∂x ∂ ∂y 3.2 Directional derivatives = f ( x(t ), y (t ) ) + f ( x(t ), y (t ) ) (27) ∂x ∂t ∂y ∂t The derivatives f x ( x, y ) , f y ( x, y ) , f xx ( x, y ) etc are taken = f x ( x0 , y0 ) cos α + f y ( x0 , y0 ) sin α along the direction of the coordinates. Derivatives can also be defined in an arbitrary direction. This equation is valid for the single point ( x0 , y0 ) for which the direction n and α was defined. However, it is valid for Let’s associate to each location ( x0 , y0 ) in the image plane, each point on which a direction is defined. If we assume a unit vector n that defines a direction at that point: that n and α is available for any point, we can write: cos α f n′( x, y ) f x ( x, y ) cos α + f y ( x, y ) sin α = (28) n= (23) sin α Similar to eq (27), the second order directional derivative is α is the angle between the vector n and the x-axis. The calculated by: line that passes ( x0 , y0 ) in the direction of n has the following parametrization: f n′′( x, y ) = f xx ( x, y ) cos 2α + (29) + f yy ( x, y ) sin 2α + 2 f xy ( x, y ) cos α sin α ) x0 + t cos α x(t= (24) ) y0 + t sin α y (t= In Figure 13, the direction n and α are kept constant all over the image plane. This is not a necessity. The direction t is the free running variable, whereas x(t ) and y (t ) are is allowed to vary. In our thoughts, n and α must be now dependent variables. Eq (24) defines a 1D function replaced by n( x, y ) and α ( x, y ). f ( x0 + t cos α , y0 + t sin α ) g (t ) = (25) 3.3 Rotation invariants g (t ) represents the grey levels on a cross section on the In a camera image, the information that this image line given by eq (24). provides about an object does not depend on the orientation of this camera. Likewise, in a 3D medical scan, The first order and second order directional derivatives are the information that this scan provides about the patient given by: does not depend on the orientation of the patient relative d d2 to the scanner. However, the image derivatives, discussed =f n′( x0 , y0 ) = g (t ) f n′′( x0 , y0 ) g (t ) (26) so far, are defined in a coordinate system that is attached dt t =0 dt 2 t =0 to the camera or scanner. They are features of the image, but are depending on the choice of this coordinate system. Changing the orientation of the camera, or changing the orientation of the patient, will change the image derivatives. This is an unwanted property of image features. Features should be intrinsic properties of the object or patient, and should not depend on the particular choice of the coordinate system in which they are given. They should Figure 13. Directional derivatives at a scale of σ = 7. be rotation invariant. a) Original. The arrow indicates the direction. b) First order f n′( x, y ). c) Second order f n′′( x, y ). 10 Figure 14. The gradient of an image at a scale of σ = 5. a) Original. b) Image at a scale of σ = 5. c) Image as a surface landscape plot with contour lines. d) Contours together with the gradient vector field ∇f. e) andf) First order derivatives. g) Gradient magnitude ∇f. Gradient direction ∠∇f in degrees. Suppose that Rθ {.} is a rotation operator, which means The direction of the gradient vector is: that g = Rθ {f } is a rotated version of the image f. The rotation angle is θ. An image operator O{.} is called atan2 ( f y ( x, y ), f x ( x, y ) ) ∠∇f ( x, y ) = (33) rotation invariant if for any θ : atan2(.,.) is the four-quadrant atan() function. Rθ−1 {O {Rθ {f }}} = O {f } (30) Figure 14 illustrates this. Figure 14c shows an image at a In other words, if g = O{f } , then it doesn’t matter whether scale of σ = 5 shown as a 2D height landscape in a 3D f was rotated before applying O{.} provided that the result graph. Also, some contours are shown, i.e., lines of was rotated back after applying O{.}. constant height. In Figure 14d, these contours are plotted together with some gradient vectors. Figure 14g displays Image derivatives are not rotation invariant, but some the gradient magnitude. Figure 14h shows the direction of combinations of them are. Here we discuss the gradient the gradient. The following observations are made: magnitude and the Laplacian. In next sections, others will A gradient vector at a position ( x, y ) is always follow. perpendicular to the contour at ( x, y ). Gradient vectors point in the direction of the steepest 3.3.1 Gradient magnitude ascent of the height landscape. The gradient of a scalar function f ( x, y ) is a vector field The length of the gradient vector is proportional to the that is made up by the first order partial derivatives: steepness of the height landscape. The gradient magnitude is rotation invariant. f x ( x, y ) At a top of a ridge, the orientation of the gradient ∇ f ( x, y ) = (31) f y ( x, y ) shows a stepwise transition of 1800. The gradient and contour are perpendicular to each other ∇ is the so-called Nabla-operator. The gradient magnitude The contour is the line of constant height. Therefore, along is the Euclidean length of this vector: the contour, that is in the direction β of the contour, the ∇ = f ( x, y ) f x2 ( x, y ) + f y2 ( x, y ) (32) 11 directional derivative must be zero. Applying this to eq (28) yields: f x ( x, y ) cos β + f y ( x, y ) sin β = 0 (34) In a concise notation, defining n T = [cos β sin β ] as the direction of the contour, eq (34) becomes n T ∇f ( x, y ) = 0. Here, n T ∇f ( x, y ) is the inner product of n and ∇f ( x, y ). If the inner product is zero, the two vectors must be orthogonal. Figure 15. First order gauge. Q.E.D. a) Original with gradient vectors. b) First order gauge coordinates and the gradient The gradient vector points in the steepest direction magnitude. Green: w , red: v. To understand this behaviour, it is helpful to analyse the Substitution in ∇g (ξ ,η ) and working out the expression, first order directional derivative f n′( x, y ) in eq (28) to see in using the goniometrical rule cos 2 θ + sin 2 θ = 1 , yields: which direction it is maximum. Extrema of f n′( x, y ) are found by differentiation f n′( x, y ) with respect to α and ∇= g (ξ ,η ) gξ2 ( x, y ) + gη2 ( x, y ) equating this to zero: (38) = f x2 ( x, y ) + f y2 ( x, y ) = ∇ f ( x, y ) d f n′( x, y ) = − f x ( x, y ) sin α + f y ( x, y ) cos α = 0 (35) Q.E.D. dα First order gauge coordinates The solution is: The rotation invariance of the gradient magnitude can also =cos α A= f x ( x, y ) sin α A f y ( x, y ) be made plausible from another point of view: calculate the derivatives in a local coordinate system instead of the where A is a constant such that cos 2α + sin 2α =1. global one. In a local coordinate system, each pixel ( x, y ) Defining w T = [cos α sin α ] as the normal vector has its own pair of basis vectors, w ( x, y ) and v ( x, y ). associated with the direction of maximum directional These pairs are orthogonal, w ( x, y ) ⊥ v ( x, y ). A basis derivative, it follows that: vector has unit length w = ( x, y ) = v( x, y ) 1. This local coordinate system is called a gauge. Figure 15 presents an cos α 1 f x ( x, y ) ∇ f ( x, y ) example. Often, the concise notation w and v is used, in =w = = (36) sin α A f y ( x, y ) ∇ f ( x, y ) which the dependency on ( x, y ) is implicit. In the present case, the gauge coordinates are derived Thus, ∇f ( x, y ) = ∇f ( x, y ) w points in the direction with from the gradient vector, i.e., the first order derivatives. maximum derivative. Q.E.D. Therefore, they are called first order gauges. The direction The gradient magnitude is rotation invariant of the gradient vector is the first basis vector w = ∇f ∇f. The proof of this property follows from applying a rotation to The second basis vector must be orthogonal and, thus, is the input image, and see how the gradient depends on this. calculated from rotating w over 900: The rotation is done via a coordinate transformation: 0 +1 v= w (39) = x ξ cos θ + η sin θ −1 0 g (ξ ,η ) = f ( x, y ) with: −ξ sin θ + η cos θ y = The directional derivative in the direction v is f v′( x, y ) = 0. The partial derivatives are: This follows immediately from the fact that v is lying along the contour lines. The directional derivative in the other gξ (ξ ,η ) = f x ( x, y ) cos θ + f y ( x, y ) sin θ direction is: (37) gη (ξ ,η ) = − f x ( x, y ) sin θ + f y ( x, y ) cos θ 2 (∇f )T ∇f f w′ (= x, y ) w= T ∇f = ∇f = ∇f (40) ∇f ∇f 12 The Laplacian of a ridge, e.g., a vessel, is a negative line-like structure. These observations are partly explained by considering a 1D signal with pulses and steps and by seeing how the first and second order derivatives looks like. This is visualized in Figure 17. Figure 16. The Laplacian. a) Original image. b) Laplacian at σ = 2. c) Laplacian at σ = 5. The calculation of the Laplacian in scale-space can be done The red lines are zero-crossing where ∆f = 0. as follows: d) 1D example showing responses to pulses and steps.. ∂2 ∂2 In other words, the gradient magnitude is the directional f (σ ) ∆= f * h(σ ) + 2 f * h(σ ) ∂x 2 ∂y derivative calculated in the direction of the gradient vector. = f * ( h xx (σ ) + h yy (σ ) ) (42) The rotation invariance is apparent since rotating the object = f * ∆h(σ ) in the image will also rotate the gauge being attached to this object. Thus, the derivative f w′ ( x, y ) will not be Working out the expression for ∆h(σ ) and subsequent affected. sampling yield the following discrete PSF: 3.3.2 The Laplacian − m2 + n2 Another rotation invariant feature is the Laplacian ∆f ( x, y ) hlap (n, m, σ ) = 1 2σ π 6 (n 2 + m − 2σ 2 2 )e 2σ 2 (43) defined as: Since this function is only a function of n 2 + m 2 , the PSF is ∆f ( x, y )= f xx ( x, y ) + f yy ( x, y ) (41) rotational symmetric. This also proves that the Laplacian of an image is rotation invariant. Figure 18 is a graphical Figure 16 gives an example. The following observation can representation of the so-called LoG, the Laplacian of a be made: Gaussian. In a region of constant grey levels, the Laplacian is zero. 3.3.3 Implementation details At one side of a steplike transition of grey levels, the MATLAB code for calculating the gradient magnitude and Laplacian is positive. At the other side, it is negative. the Laplacian in scale-space is as follows: At the position of the step transition, the Laplacian fgrad = (fx.^2 + fy.^2).^(1/2); must be zero. In Figure 16, these zero-crossings are fLap = fxx + fyy; indicated by red lines. The Laplacian of a bright blob in the image is a The equivalent Python code is: negative peak. A dark blob becomes a positive peak. fgrad = (fx**2 + fy**2)**(1/2) fLap = fxx + fyy Figure 17. 1D example showing the first order derivative f x ( x, y ) and the Laplacian of f xx ( x, y ) of signal with pulses and steps. 13 height landscape f ( x, y ). As such it is an image feature that is strongest on step-edges. Figure 14g is a graphical representation of the estimated gradient magnitude of the image in Figure 14a.It can be seen that the gradient magnitude is large at locations corresponding to object boundaries. Edge detection has been a topic of research starting in the late 60s and ending in the 90s. The design of an edge detector is model-based. It often consists of 2 stages: I. Optimal design of a 1D step-edge detector. II. Generalization of the 1D case to the 2D case. Figure 18. LoG: Laplacian of a Gauss with σ = 5. 4.1 1D edge detection Homework 3_3: In 1D edge detection, three types of errors can be Copy the file H3_1.m to H3_3.m and extend the script to: distinguished: detection errors, localization errors, and 1. calculate imgrad, the gradient magnitude of the test multiple responses. Detection errors relate to the image and its cross-section as in homework.3_1. Use probabilities of having a missed or a spurious edge. A σ = 1. multiple response occurs when a single step is detected at 2. Repeat with σ = 3. two or more locations. In literature, different criteria have 3. Discussion: observe remarkable things and explain been proposed to quantify these errors. These criteria are Homework 3_4: used in the optimal design of an edge enhancement filter. Copy the file H3_2.py to H3_4.py and extend the script to: The remarkable thing here is that these different criteria all 1. calculate imLap, the Laplacian of the test image and its lead to approximately the same solution: the optimal filter cross-section as in homework 3_2. Use σ = 1. is a convolution with the first derivative of a Gaussian as 2. Repeat 1 with σ = 3. PSF. This fits seamlessly within scale-space theory. 3. Discussion: observe remarkable things and explain The 1D procedure is illustrated in Figure 19. This example shows a sampled signal f (n) with four step transitions. 4 Edge detection The exact locations are indicated with dashed lines. The The capability of estimating the derivatives of an image absolute value of the first derivative f x (n, σ ) , obtained paves the way for edge detection. Step-edges are the via Gaussian filtering, enhances the steps. Thresholding, locations in the image plane where the grey levels show a f x (n, σ ) ≥ T , provides a list of candidate edges. The step-like transition. Edge detection is useful since often threshold should be used above the noise level. In Figure edges correspond to object boundaries and other 19, the candidate edges are indicated with big red markers. characteristic points. The gradient magnitude of the grey From this list of candidates, and for each step, one edge levels at a point ( x, y ) is the slope of the corresponding must be chosen. This process is the localization of the Figure 19. 1D edge detection. Thick red dots: candidate edges. Blue dots: possible edge locations. Blue crosses: detected edges. 14 edges. The most likely locations are at the maximums of 4.2 Generalization from 1D to 2D edge detection f x (n, σ ). This coincides with the zero-crossings of the In order to generalize the 1D design to images, both the second derivative f xx (n, σ ). In the continuous case, these edge enhancement/detection part and the localization part are locations for which f xx ( x) = 0. In the discrete case, must be addressed. The most straightforward and simplest finding zero-crossings is more involved. It will seldom occur generalization of the enhancement/detection part is to that f xx (n, σ ) = 0. Instead, a zero-crossing between n and replace f x (n, σ ) by the gradient magnitude ∇f (σ ). See n + 1 is detected if f xx (n, σ ) f xx (n + 1, σ ) < 0. In Figure 19, Section 3.3.1. The detection itself can be done by the found zero-crossings are indicated with big blue thresholding resulting in a map of candidate edges. Figure markers. The final detected edges are indicated with blue 20b and c present an example. crosses. In Figure 19, the scale at which the derivatives are 4.2.1 Marr-Hildreth operation determined is σ = 2.5. This choice influences the amount The 2D localization problem was first addressed by Marr of noise suppression, but it also affects the detection. In and Hildreth (1979). Being a rotation invariant image the figure, there are two steps which are separated by a feature, the Laplacian of a Gaussian filtered image, i.e., distance of only 3 sample periods. Because σ = 2.5 , the f (σ ) f xx (n, m, σ ) + f yy (n, m, σ ) , seems to be a natural ∆= spatial support of two PSF separated by 3 sample periods generalization of f xx (n, σ ). The Marr-Hildreth operator overlap. This causes a systematic localization error: the considered the locations at which the Laplacian ∆f (σ ) is steps are detected with a separation distance of 5 zero: the so-called zero-crossings. The map of these samples. locations was called the primal sketch. Figure 20e is an example. Figure 20. Marr-Hildreth edge detection at a scale of σ = 1. 15 In their work, Marr and Hildreth considered their primal ∆f (σ ) =f * ∆h(σ ) sketch as the primary data representation from which (Φ + n) * ( h xx (σ ) + h yy (σ ) ) = visual representations at higher abstraction levels should = Φ * h xx (σ ) + n * h xx (σ ) + n * h yy (σ ) be developed. However, Figure 20e clearly shows that the zero-crossings consist of many spurious edges which The term Φ * h yy (σ ) vanished because Φ does not should be suppressed. The reason for these false edges is depend on y. The term n * h xx (σ ) is inevitable because that object surfaces often possess tiny and less tiny we need the differentiation in the x direction. However, the textures and reliefs, which cause subtle variations in the term n * h yy (σ ) is there, but could be avoided as we don't grey levels: object noise. Although having small amplitudes, need differentiation in the y direction in this specific case. this object noise causes many irrelevant zero-crossings of The differentiation in the y direction only increases here the the Laplacian. uncertainty of the edge location. If we only had used To have a useful edge map, the last step in the Marr- f * h xx (σ ) , instead of f * ∆h(σ ) , the result would have Hildreth operation is to accept only zero-crossings which been more accurate. are also candidate edges: the intersection of candidates In this specific case, only differentiation in the x direction is and zero-crossings. Figure 20f gives the resulting edge useful. This is not generally true. If the 2D step-edge was map. rotated, we would have to rotate the direction of differentiation too: we need the gauge coordinates. The 4.2.2 Canny edge detector differentiation must be done across the edge boundary: in Localization of edges the direction of the gradient vector. This is the direction Although the Marr-Hildreth operation can produce nice indicated by: w ( x, y ) = ∇f (σ ) ∇f (σ ). Instead of using edge maps, it has an important shortcoming. This was first the Laplacian, Canny proposed to consider: recognized by Canny in 1984. To understand this the second order directional derivative in the direction of shortcoming, consider a hypothetical 2D step-edge with the the gradient vector: f w′′(n, m, σ ) with w = ∇f (σ ) ∇f (σ ). edge boundary oriented along the y-axis and that is contaminated by additive noise: The calculation of f w′′(n, m, σ ) is done via eq (23) and (29). Let: f ( x, y ) = Φ ( x ) + n ( x, y ) f (σ ) f y (σ ) Φ ( x) is the step function. It does not depend y in this = cos α = x sinα (44) ∇ f (σ ) ∇ f (σ ) specific case. n ( x, y ) is the noise. The Laplacian of this noisy 2D step-edge, in scale-space shorthand notation, is: Then: Figure 21. Canny edge detection. 16 f w′′ (σ ) = f xx (σ ) cos 2α + f yy (σ ) sin 2α + 2f xy (σ ) cos α sin α Canny proposed to use two thresholds, Tlow and Thigh with f xx (σ )f x2 (σ ) + f yy (σ )f y2 (σ ) + 2f xy (σ )f x (σ )f y (σ ) Tlow < Thigh. Application to the gradient magnitude yields two = 2 candidate edge maps which we indicate by e weak and e strong : ∇ f (σ ) Tlow ⇒ e weak : many weak edges: few FN, many FP. (45) Thigh ⇒ e strong : few strong edges: many FN, few FP. Clearly, an edge that is detected in e strong will also be Edges are localized at positions at which f w′′ (σ ) = 0. These detected in e weak. That is: e strong ⊂ e weak. In an edge map, a locations coincide with positions where the numerator is sequence of connected edge elements forms a line zero: segment: a string of connected edge elements. The strategy in hysteresis thresholding is to retain only the line f xx (σ )f x2 (σ ) + f yy (σ )f y2 (σ ) + 2f xy (σ )f x (σ )f y (σ ) = 0 (46) segments in e weak which have at least one edge element in common with e strong. The procedure is illustrated in Figure The result of the Canny edge detector applied to Figure 20a 22. is shown in Figure 21. This result is similar to the Marr- Hildreth result, but close inspection reveals that the Canny 4.2.3 Influence of the scale edge segments are less ragged. Bifurcations are difficult for both methods, but usually Canny performs somewhat The edge detector in the previous sections were executed better than Marr-Hildreth. at the finest scale, σ = 1. This reveals the details of the objects best. For many applications, this is a good choice, Hysteresis thresholding but for a few applications increasing the scale would be The chosen threshold T is an important parameter in edge better. An example is the quilt image in Section 2. detection. If the contrast step heights between regions is large compared with the noise level, the choice is well- In Figure 23, a grey level version of the quilt is shown. To defined. Often, this is not the case, and the choice is critical have large contrast between the colours, the grey level as it determines the trade-off between the number of image is obtained by subtracting the blue channel from the missed true edges (false negatives, FN), and number of red channel. Canny edge detection has been applied at falsely detected spurious edges (false positives, FP). three different scales. At the finest scale, the small triangular primitives of the quilt are detected. At a coarse It is difficult to automatize the selection of the threshold. In scale, the larger rectangular structures made up by the some applications, it helps to define the number of edges smaller triangles are detected. as a fraction of the number of pixels. It there are NM pixels in the image, then one could require that the number 4.2.4 Implementation details of detected edges is γ NM in which γ is a percentage. For Pseudo code for the Marr-Hildreth and Canny edge instance, in Figure 20c, the threshold was determined such detection is as follows: that the percentage was 15%. Figure 22. Hysteresis thresholding. The final map contains only the line segments in the weak edge map, that have at least one edge element in the strong edge map 17 to implement the Marr-Hildreth and the Canny edge detection. However, one need an important subfunction: the detection of zero-crossings. It is not available in standard packages. The implementation can be done with morphological operations, which is not the topic of this lecture note. Homemade functions are available for MATLAB, ut_levelx(), and for Python ut.levelx(). Homework 3_5: Copy the file H3_4.py to H3_5.py. Remove the part of the cross-sections. Read the file aorta.png instead of the test image, and cast it to type double. The image is a CT showing the aorta and some branches along with other abdominal structures. The goal is to segment the image with edge detection such that only the aorta with the first branches is detected. 1. Calculate imLap with σ = 1 as you did before. 2. Calculate also the gradient magnitude imgrad. For that you need first the calculate first order derivatives imx and imy. Figure 23. Canny edge detection at different scales. 3. Using the function ut.levelx(), create a map z of the Input: zero-crossings of the Laplacian. To enable this - grey level image f function, put the file in project folder and add import - scale parameter σ ut_functions as ut in the first section of H3_5.py. - thresholds Tlow , Thigh 4. Calculate the edge feature map imfeat by multiplying 1. Determine the image derivatives f x (σ ) , f x (σ ) , f y (σ ) , imgrad with z. f xx (σ ) , f yy (σ ) , and f xy (σ ). 5. Threshold the image with a single threshold T. In 2. Calculate the gradient magnitude ∇f (σ ). Python is done by: imedge = np.float64(imfeat>T). 3. Calculate 2nd order derivative for edge localization. The cast to doubles is done to facilitate visualization of MH: floc (σ ) =∆f (σ ) = f xx (σ ) + f yy (σ ). the result. For now, choose T = 0.01 Canny:= floc (σ ) f xx (σ )f x (σ ) + 2 6. The procedure depends on two design parameters: σ +f yy (σ )f y2 (σ ) + 2f xy (σ )f x (σ )f y (σ ) and T. Play around with these parameters by 4. Calculate the zero-crossings rerunning the code with different values. Select the = z (σ ) ( floc (σ ) ≡ 0 ) ones that gives the best trade-off between “detection 5. Mask the gradient magnitude with zero-crossings of the aorta and 1st branches” and “veiling the other = f z (σ ) ∇f (σ ) ⋅ z (σ ) structures”. Hint: determine first the minimal diameter 6. Apply (hysteresis) thresholding to f z (σ ). of a vessel that you want to detect. MATLAB provides a function edge() which can perform 7. Save the zeros 1-z to a file..\images\H3_5z.png (the both Marr-Hildreth (option: "log") and Canny edge inverse 1-z provides better visibility of the zeros than detection (option: "canny"). just z). Save also other images to file: imfeat and 1-imedge to H3_5feat.png and H3_5edge.png. Python/OpenCv does not provide Marr-Hildreth edge 8. Discussion: observe remarkable things and explain detection. It has a Canny implementation, which is mediocre as it is based entirely on the Sobel operator Homework 3_6: rather than a Gaussian scale-space implementation. The Repeat homework 3_5, but now in MATLAB using the Canny Python/skimage package contains a more decent Canny operator as implemented in the MATLAB function edge(). implementation. Optimize the 3 parameters σ and the two thresholds. Start with σ = 1 and Th=[0.05 0.1]. Compare the results with Since it is easy to implement image differentiation in the scale-space, and to calculate the gradient magnitude and the Marr-Hildreth approach in Homework 3_5. the Laplacian, see Homework 3_3 and 3_4, it is also easy 18 Figure 24. Line-like image structure. Figure 25. Truncated Taylor series expansion of line-like structures. In order for (0, 0) to be a line element, the parameters a 5 Line detection and b must be near zero, and B must be small compared to A. In Figure 25b, a , b , and B are small, but not zero. Loosely speaking, in Euclidean geometry, a point in a 2D plane is defined as something which has a location in the Equation (48) can be written in vector-matrix notation: plane, but which does not have a size. A line or curve is a chained, ordered sequence of connected points, such that T x 1 x T f xx 0 x f ( x, y ) ≈ f (0, 0) + ∇ f + 2 (49) this sequence has a length without having a width. y y 0 f yy y The image of a point is the PSF of the imaging system. The It i