In the name of Allah, the Most Gracious, the Most Merciful

Segmentation
Prof. Mohamed Berbar, CSC558

In the analysis of the objects in images it is essential that we can distinguish between the objects of interest and the background. The techniques used to find the objects of interest are usually referred to as segmentation techniques: segmenting the foreground from the background. The most common techniques are thresholding and edge finding.

Note:
❖ Segmentation depends on the application and its semantics.
❖ There is no universally applicable segmentation technique that will work for all images.
❖ No segmentation technique is perfect.

Segmentation Algorithms for Feature Extraction
1. Thresholding
2. Histogram-based segmentation techniques
3. Region-based segmentation
4. Edge-based segmentation
5. Template matching

1- Thresholding

I - Global Thresholding
In digital image processing, thresholding is the simplest method of segmenting images. A single global threshold is computed from the whole image; as a result, it can be incorrect in some regions.

Thresholding is the process of converting grey-level images to binary images.
[Figure: (a) image to be thresholded; (b) brightness histogram of the image.]
[Figure: input image and the binarized images obtained with thresholds 142 and 220.]
The choice of threshold level has a great effect on the output binary image.

Thresholding transforms a grey/colour image to binary:
if f(x, y) > T then output = 1, else output = 0.
How do we find T? Note: binary images have only two grey values.

Pros and Cons of Global Thresholding
Its advantages include simplicity and efficiency, since a single threshold value is determined for the entire image. It is particularly effective in scenarios where the foreground and background regions have distinct intensity distributions. However, global thresholding may not be suitable for images with complex intensity distributions, or when there is significant variation in lighting conditions across the image. Additionally, it may not accurately segment objects or regions that have overlapping intensity values.

Otsu's Method
Otsu's method is a widely used technique for automatically determining the optimal threshold value in image segmentation. It calculates the threshold by maximizing the between-class variance of the pixel values, which effectively separates the foreground and background regions. This method is particularly useful when dealing with images that have bimodal or multimodal intensity distributions, as it can accurately identify the threshold that best separates different objects or regions in the image.

II - Local Thresholds
Divide the image into regions, compute a threshold per region, and merge the thresholds across region boundaries.

Local thresholding addresses the limitations of global thresholding by considering smaller regions within the image. It calculates a threshold value for each region based on its local characteristics, such as the mean or median intensity. A sketch contrasting the global and local variants follows.
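As a minimal sketch, assuming Python with OpenCV (the lecture does not prescribe a library; the file name, the block size 31 and the offset C = 5 are illustrative placeholders), the following contrasts a fixed global threshold, Otsu's automatic choice, and the two local (adaptive) variants detailed in the next part:

    import cv2

    img = cv2.imread("coins.png", cv2.IMREAD_GRAYSCALE)  # placeholder image

    # Global threshold with a hand-picked T: output = 255 where f(x, y) > T.
    _, global_bin = cv2.threshold(img, 142, 255, cv2.THRESH_BINARY)

    # Otsu's method: T is chosen automatically by maximizing the
    # between-class variance; the chosen T is returned as the first value.
    t_otsu, otsu_bin = cv2.threshold(img, 0, 255,
                                     cv2.THRESH_BINARY + cv2.THRESH_OTSU)

    # Local (adaptive) thresholds: one T per neighbourhood, computed from
    # the plain local mean or from a Gaussian-weighted local mean.
    mean_bin = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                                     cv2.THRESH_BINARY, 31, 5)
    gauss_bin = cv2.adaptiveThreshold(img, 255,
                                      cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                      cv2.THRESH_BINARY, 31, 5)

Note that OpenCV conventionally uses 255 rather than 1 for the "on" value, so that the binary result displays as white.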
One advantage is that it can handle images with varying lighting conditions or uneven illumination, because adaptive thresholding calculates the threshold value locally, taking into account the specific characteristics of each sub-region. Additionally, adaptive thresholding can help preserve important details and fine textures in an image, as it adjusts the threshold value based on the local pixel intensities.

Mean and Gaussian Adaptive Thresholding
Two commonly used local methods are mean and Gaussian adaptive thresholding. Mean adaptive thresholding calculates the threshold value for each sub-region by taking the average intensity of all pixels within that region. Gaussian adaptive thresholding uses a weighted average of pixel intensities, giving more importance to pixels closer to the centre of the sub-region. These methods are effective in enhancing image quality and improving accuracy in tasks such as object detection or segmentation (both appear in the sketch above).

Band Thresholding
[Figure: the image with all the pixels except the 8's blanked out.]
[Figure: the same image with a threshold point of 5.]

Real-World Applications of Thresholding
Object detection: by setting a threshold value, objects can be separated from the background, allowing for more accurate and efficient object detection.
Medical images: thresholding can be used to segment different structures or abnormalities for diagnosis and analysis in medical imaging.
Quality control: thresholding plays a crucial role in quality control processes, such as inspecting manufactured products for defects or ensuring consistency in the colour and texture of a colour image.
Object segmentation: thresholding is commonly used in computer vision tasks such as object segmentation, where it helps to separate foreground objects from the background, enabling more accurate and efficient detection of objects within an image.
Noise reduction: thresholding can help eliminate unwanted artifacts or disturbances in an image.
Edge detection: thresholding aids edge detection algorithms in identifying and highlighting the boundaries between different objects or regions within an image.

2. Histogram-Based Segmentation Techniques
A. Triangle algorithm
B. Histogram peak technique

Histogram-based segmentation depends on the histogram of the image. Therefore, you must prepare the image and its histogram before analyzing it.
Preprocessing for thresholding:
1. Histogram equalization
2. Histogram smoothing

1. The first step is histogram equalization (Phillips, August 1991). The result is an image with better contrast.
2. The next preprocessing step is histogram smoothing. Note: this is histogram smoothing, not image smoothing. When examining a histogram, you look at peaks and valleys. Too many tall, thin peaks and deep valleys will cause problems. Smoothing the histogram removes these spikes and fills in empty canyons while retaining the same basic shape of the histogram.
[Figure: the result of smoothing the histogram.]
Smoothing a histogram is an easy operation: it replaces each point with the average of it and its two neighbours.
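A minimal sketch of this preprocessing step, assuming Python with NumPy (the helper names are illustrative): compute the grey-level histogram, then smooth it with the three-point average just described.

    import numpy as np

    def grey_histogram(img, levels=256):
        # Count how many pixels take each grey value 0..levels-1.
        hist = np.zeros(levels, dtype=np.int64)
        for v in img.ravel():
            hist[v] += 1
        return hist

    def smooth_histogram(hist):
        # Replace each point with the average of it and its two neighbours;
        # the endpoints are averaged with their single neighbour.
        out = hist.astype(np.float64)
        out[1:-1] = (hist[:-2] + hist[1:-1] + hist[2:]) / 3.0
        out[0] = (hist[0] + hist[1]) / 2.0
        out[-1] = (hist[-2] + hist[-1]) / 2.0
        return out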
After smoothing, a healthy histogram shows up as a strong graph that does not appear too "thin" or lacking in data, ideally centred around the middle with no data running off either end of the graph.

Normalization of a Histogram
Normalizing a histogram is a technique that transforms the discrete distribution of intensities into a discrete distribution of probabilities. To do so, we divide each value of the histogram by the total number of pixels. Because a digital image is a discrete set of values that can be seen as a matrix, this is equivalent to dividing each count nk by the dimension of the array, which is the product of the width and the height of the image: p(k) = nk / (width × height).

A - Automated Methods for Thresholding

A.1 Triangle Algorithm
[Figure: the triangle algorithm is based on finding the value of b that gives the maximum distance d.]
A line is constructed between the maximum of the histogram at brightness bmax and the lowest value bmin = (p = 0)% in the image. The distance d between the line and the histogram h[b] is computed for all values of b from b = bmin to b = bmax. The brightness value bo where the distance between h[bo] and the line is maximal is the threshold value, that is, T = bo. This technique is particularly effective when the object pixels produce a weak peak in the histogram. (A sketch follows the exercises below.)

B - Histogram Peak Technique
This technique uses the peaks of the histogram. It finds the two peaks in the histogram corresponding to the background and the object of the image, and sets the threshold halfway between the two peaks.
[Figure 5: the result of smoothing the histogram given in Figure 2.]
Looking back at the smoothed histogram in Figure 5: the background peak is at 2 and the object peak is at 7. The midpoint is 4, so the low threshold value is 4 and the high is 9.

Exercises:
1. What are the advantages/disadvantages of global thresholding and local thresholding?
2. Write a program/algorithm to calculate the grey-scale histogram of an image and store the result in an array.
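Along the lines of exercise 2, here is a sketch of the triangle algorithm described above, assuming NumPy and the grey_histogram helper from the earlier sketch; the geometry is the standard point-to-line distance, since the lecture's figure did not survive extraction.

    import numpy as np

    def triangle_threshold(hist):
        b_max = int(np.argmax(hist))          # brightness of the main peak
        b_min = int(np.nonzero(hist)[0][0])   # lowest occupied grey level
        # Line through (b_min, h[b_min]) and (b_max, h[b_max]),
        # written in the form a*x + b*y + c = 0.
        a = float(hist[b_max] - hist[b_min])
        b = float(b_min - b_max)
        c = -(a * b_min + b * hist[b_min])
        lo, hi = sorted((b_min, b_max))
        bins = np.arange(lo, hi + 1)
        # Perpendicular distance d from each histogram point to the line;
        # the threshold T = bo is where d is maximal.
        d = np.abs(a * bins + b * hist[lo:hi + 1] + c) / np.hypot(a, b)
        return int(bins[np.argmax(d)])

The peak technique would instead locate the two tallest peaks of the smoothed histogram and place the threshold at their midpoint.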
3. Region-Based Segmentation
All pixels belong to a region: the object, part of the object, or the background. The task is to find a region's constituent pixels using some criteria, for example its boundary.
Region detection: given a set of pixels P and a homogeneity predicate H(P), partition P into regions {R}.

Region-Growing Algorithms
All pixels belong to a region: select a pixel, then grow the surrounding region.
The difficult task is region growing. The "object" in Figure 6 is a happy face. It comprises three different regions (the two eyes and the smile). Region growing takes this image, groups the pixels in each separate region, and gives them unique labels. Figure 7 shows the result of region growing performed on Figure 6: one eye is labelled region one, the other eye region two, and the smile region three.
[Figure 7: the result of region growing performed on Figure 6.]

The algorithm for region growing (a sketch follows this list):
1. It begins with an image array g comprising zeros and pixels set to a value.
2. The algorithm loops through the image array looking for a pixel with g(i, j) == value.
3. When it finds such a pixel, it calls the label_and_check_neighbor routine. label_and_check_neighbor sets the pixel to g_label (the region label) and examines the pixel's eight neighbours. If any of the neighbours equal value, they are pushed onto a stack.
4. When control returns to the main algorithm, each pixel on the stack is popped and sent to label_and_check_neighbor.
5. All the points on the stack equalled value, so you set them and check their neighbours. After setting all the pixels in the first region, you increment g_label and move on, looking for the next region.
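A minimal sketch of this procedure, assuming Python with NumPy (the lecture gives the algorithm only in prose; here FOREGROUND plays the role of "value"):

    import numpy as np

    FOREGROUND = 1

    def label_and_check_neighbor(g, labels, i, j, g_label, stack):
        # Set the pixel to the current region label, then push any of its
        # eight neighbours that equal FOREGROUND and are still unlabelled.
        labels[i, j] = g_label
        for di in (-1, 0, 1):
            for dj in (-1, 0, 1):
                ni, nj = i + di, j + dj
                if (0 <= ni < g.shape[0] and 0 <= nj < g.shape[1]
                        and g[ni, nj] == FOREGROUND and labels[ni, nj] == 0):
                    stack.append((ni, nj))

    def region_grow(g):
        labels = np.zeros(g.shape, dtype=np.int32)
        g_label = 0
        for i in range(g.shape[0]):
            for j in range(g.shape[1]):
                if g[i, j] == FOREGROUND and labels[i, j] == 0:
                    g_label += 1                  # a new region starts here
                    stack = []
                    label_and_check_neighbor(g, labels, i, j, g_label, stack)
                    while stack:                  # pop until the region is done
                        pi, pj = stack.pop()
                        if labels[pi, pj] == 0:
                            label_and_check_neighbor(g, labels, pi, pj,
                                                     g_label, stack)
        return labels

Applied to the happy-face image, this returns an array in which the two eyes and the smile carry the labels 1, 2 and 3.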
Split and Merge
Split-and-merge segmentation is an image processing technique used to segment an image. The image is successively split into quadrants based on a homogeneity criterion, and similar regions are merged to create the segmented result. The technique incorporates a quadtree data structure, meaning that there is a parent-child node relationship: the total region is a parent, and each of the four splits is a child.
Initialise the image as a single region. While a region is not homogeneous, split it into quadrants and examine their homogeneity.

Recursive splitting algorithm:

    Split(P) {
        if (!H(P)) {
            partition P into subregions P1 ... P4;
            Split(P1); Split(P2); Split(P3); Split(P4);
        }
    }

Recursive merging algorithm: if adjacent regions are weakly split (a weak edge between them) or similar (similar greyscale/colour properties), merge them.

4. Edge-Based Segmentation
The most common problems of edge-based segmentation are an edge presence in locations where there is no border, and no edge presence where a real border exists. Borders resulting from the thresholding method are strongly affected by image noise, often with important parts missing.

Edge Detection and Following
Detection finds candidate edge pixels; following links the candidates to form boundaries. Contour tracking: scan the image to find the first edge point, track along the edge points, and join edge segments.

Edge Detection
The goal of edge detection is to mark the points in a digital image at which the luminous intensity changes sharply. Sharp changes in image properties usually reflect important events and changes in the properties of the image. Edge detection significantly reduces the amount of data and filters out information that may be regarded as less relevant, while preserving the important structural properties of an image.

The methods for edge detection can be grouped into two categories: search-based and zero-crossing based. Search-based methods detect edges by looking for maxima and minima in the first derivative of the image, usually local directional maxima of the gradient magnitude. Zero-crossing based methods search for zero crossings in the second derivative of the image in order to find edges, usually the zero-crossings of the Laplacian or of a non-linear differential expression.

In computer vision, edge detection is traditionally implemented by convolving the signal with some form of linear filter, usually a filter that approximates a first- or second-derivative operator. An odd symmetric filter will approximate a first derivative, and peaks in the convolution output will correspond to edges (luminance discontinuities) in the image.

First-Order Differential Methods of Edge Detection
If we take the derivative of the intensity values across the image and find the points where the derivative is a maximum, we will have marked our edges.
[Figure 1: an ideal step edge and its derivative profile.]
In a discrete image of pixels we can calculate the gradient by simply taking the difference of grey values between adjacent pixels.
[Figure 2: a discrete step edge and its difference profile.]

The gradient of the image function I is given by the vector ∇I = (∂I/∂x, ∂I/∂y) = (Gx, Gy). The magnitude of this gradient is |∇I| = √(Gx² + Gy²) and its direction is θ = arctan(Gy / Gx).

The simplest gradient operator is the Roberts Cross operator, and it uses the masks

    +1  0        0 +1
     0 -1       -1  0

Thus the Roberts Cross operator uses the diagonal directions to calculate the gradient vector.

A 3 × 3 approximation to ∂I/∂x is given by the convolution mask

    -1  0  1
    -1  0  1
    -1  0  1

This defines the Prewitt operator, and it detects vertical edges. The Sobel operator is a variation on this theme, giving more emphasis to the centre cell. The Sobel approximation to ∂I/∂x is

    -1  0  1
    -2  0  2
    -1  0  1

Similar masks are constructed to approximate ∂I/∂y, thus detecting the horizontal component of any edges.

Edge detection algorithm:
1. Both the Prewitt and Sobel edge detection algorithms convolve with masks to detect both the horizontal and vertical edge components; the resulting outputs are simply added to give a gradient map.
2. The magnitude of the gradient map is calculated and then input to a routine that suppresses (sets to zero) all but the local maxima. This is known as the non-maxima suppression algorithm: each pixel's gradient magnitude is compared with its neighbours along the edge direction, which is quantised from θ = arctan(Gy / Gx).
3. The resulting map of local maxima is thresholded (small local maxima will result from noise in the signal) to produce the final edge map.
It is the non-maxima suppression and the thresholding that introduce non-linearities into this edge detection scheme.
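A minimal sketch of steps 1-3, assuming Python with NumPy and SciPy (non-maxima suppression is omitted for brevity, and the threshold value is a placeholder):

    import numpy as np
    from scipy.ndimage import convolve

    SOBEL_X = np.array([[-1, 0, 1],
                        [-2, 0, 2],
                        [-1, 0, 1]], dtype=np.float64)   # vertical edges
    SOBEL_Y = SOBEL_X.T                                  # horizontal edges

    def sobel_edge_map(img, thresh=100.0):
        img = img.astype(np.float64)
        gx = convolve(img, SOBEL_X)
        gy = convolve(img, SOBEL_Y)
        magnitude = np.hypot(gx, gy)        # |grad I| = sqrt(Gx^2 + Gy^2)
        direction = np.arctan2(gy, gx)      # theta, needed by non-maxima
                                            # suppression along the edge normal
        return (magnitude > thresh).astype(np.uint8), direction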
[Figures: worked examples of edge following using direction codes, e.g. starting with direction 2: (5 + 6) mod 8 = 3, then 4; (7 + 6) mod 8 = 5; starting with direction 5: (4 + 7) mod 8 = 3, 4, then 5.]

Chain Code
We follow the contour in a clockwise manner and keep track of the directions as we go from one contour pixel to the next: trace the object outline, following the pixels on the boundary, and code the directions of movement. The eight directions are numbered 0 to 7, with 0 pointing right and the numbers increasing counter-clockwise (0 = E, 1 = NE, 2 = N, 3 = NW, 4 = W, 5 = SW, 6 = S, 7 = SE). The description is position independent but orientation dependent; differential chain codes can be used instead.

The element sequence a1 a2 a3 ... an is called the chain associated with the curve A. It can be expressed in the form A = a1 a2 a3 a4 ... an. The curve A is started at the point (xs, ys), referred to as the start point of the chain (st_point), and connects the various curve points in accordance with the element code numbers, for example 21012007670701.

Chain code properties:
* Even codes {0, 2, 4, 6} correspond to the horizontal and vertical directions; odd codes {1, 3, 5, 7} correspond to the diagonal directions.
* Each code can be considered as the angular direction, in multiples of 45°, that we must move to go from one contour pixel to the next.
* The absolute coordinates [m, n] of the first contour pixel (e.g. top, leftmost), together with the chain code of the contour, represent a complete description of the discrete region contour.
* When there is a change between two consecutive chain codes, the contour has changed direction. This point is defined as a corner.

Perimeter from chain code: even codes have length 1; odd codes have length √2, since they are diagonal steps. Perimeter length = #even + √2 × #odd.
Area from chain code: ???? See the book for more.
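A small sketch of the perimeter rule in plain Python, using the example chain from the text:

    import math

    def perimeter_from_chain(chain):
        even = sum(1 for c in chain if c % 2 == 0)
        odd = len(chain) - even
        return even + math.sqrt(2) * odd

    chain = [int(c) for c in "21012007670701"]
    print(perimeter_from_chain(chain))   # 8 even + 6 odd steps: about 16.49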
Crack Codes
An alternative to the chain code for contour encoding is to use neither the contour pixels associated with the object nor the contour pixels associated with the background, but rather the line, the "crack", in between. The crack code can be viewed as a chain code with four possible directions instead of eight. This is illustrated in the figure: the chain code for the figure, from top to bottom, is {5, 6, 7, 7, 0}; the crack code is {3, 2, 3, 3, 0, 3, 0, 0}.

Run Codes
A third representation is based on coding the consecutive pixels along a row (a "run") that belong to an object, by giving the starting position of the run and the ending position of the run. Disadvantage: it gives errors under rotation. What about this representation in a noisy image?

Segmentation using Deep Learning (Chapter 4)

Segmentation using CNN (Example 1): segmentation of MRI brain images into brain tissues such as white matter (WM), gray matter (GM), and cerebrospinal fluid (CSF).

Normalization: mx represents the maximum gray value in the data set and mn represents the minimum gray value in the data set; the gray values are normalized using these extremes. The normalized 3D data are subjected to contrast-limited adaptive histogram equalization (CLAHE). CLAHE generates better results than plain adaptive histogram equalization, since it operates on smaller regions of the image, called tiles or blocks, rather than on the entire image.

CNN architecture: the CNN architecture comprises convolution layers (CL), pooling layers and a fully connected layer. The 3D CNN architecture here comprises 2 CLs, 2 max-pooling layers, a fully connected layer and an output layer. The first CL comprises 6 filters with a kernel size of 5 × 5 and the second CL comprises 12 filters with a kernel size of 5 × 5; the pooling layers comprise filters with kernel size 2 × 2, and max pooling is employed.

Convolution layer: it is termed the heart of the CNN and does most of the computation. The vital parameter of the CL is the set of filters or kernels. The general template of a filter is m × m × 3, where m represents the size of the filter and 3 is for the red-green-blue (RGB) channels. The convolution operation is performed by sliding the filter over the entire image.

Pooling layer: the pooling layers are inserted successively between the CLs in the architecture. The objective of the pooling layer is to minimize the size of the representation, to reduce the number of parameters and the computation in the network. The general form of a pooling layer comprises filters of size 2 × 2 applied with a stride of 2, downsampling the input. The types of pooling are average pooling, sum pooling and max pooling; the pooling layer is also called a downsampling layer, since it minimizes the features, and max pooling is employed here.

Fully connected layer: the neurons in the fully connected layer have full connections to the activations in the predecessor layer. The activation is determined with a matrix multiplication followed by a bias term. The training phase consists of making the fully connected CNN architecture recognize WM, GM and CSF. The gray-value features are extracted from the images corresponding to the brain tissues WM, GM and CSF. Threshold values of 192, 254 and 128 are set for GM, WM and CSF respectively, as prescribed by the database (www.bic.mni.mcgill.ca/brainweb). For performance validation, metrics are used.
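The lecture's normalization formula did not survive extraction; assuming the standard min-max form (g - mn) / (mx - mn), a sketch of this preprocessing with NumPy and OpenCV (tile size and clip limit are assumed values) is:

    import numpy as np
    import cv2

    def preprocess(volume):
        # Min-max normalization with the data-set extremes mx and mn.
        mx, mn = float(volume.max()), float(volume.min())
        norm = (volume - mn) / (mx - mn)              # values now in [0, 1]
        norm8 = (norm * 255).astype(np.uint8)         # CLAHE expects 8-bit
        # CLAHE operates on tiles (blocks) rather than the whole image.
        clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
        return np.stack([clahe.apply(s) for s in norm8])  # slice by slice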
Example 2: U-Net
U-Net is a convolutional neural network that was developed for biomedical image segmentation. U-Net is basically a semantic segmentation architecture; semantic segmentation is the task of assigning a class to each pixel in an image. It is a supervised learning model whose performance depends on how well the ground truth is annotated; models are trained using segmentation maps as the target variables. This converts the segmentation problem into a classification problem, where we need to classify each pixel into one of the classes.

In a plain CNN, the image is converted into a vector, which is largely used in classification problems. In U-Net, an image is converted into a vector and then the same mapping is used to convert it back into an image. This difference between the plain encoder-decoder architecture and U-Net makes the U-Net network more robust with a limited dataset.

U-Net is a U-shaped encoder-decoder network architecture, which consists of four encoder blocks and four decoder blocks connected via a bridge (the bottleneck). It consists of convolution operations, max pooling, ReLU activation, concatenation and up-sampling layers. Every encoder (contraction) block takes an input, applies two 3 × 3 convolution + ReLU layers and then a 2 × 2 max pooling; the number of feature maps doubles at each pooling layer. The bottleneck layer uses two 3 × 3 convolution layers and a 2 × 2 up-convolution layer.

Residual U-Net (Example 3)
A deep residual network contains a set of residual blocks; each of these blocks consists of stacked layers such as batch normalization (BN), ReLU activation, and a weight layer (i.e. a convolutional layer).
Encoder path: the input image is resized to 128 × 128 and batch normalization is performed on this batch. After batch normalization, 2D convolution is carried out with filter size 3 × 3.
Decoder path: the decoder section of the Residual U-Net consists of an up-sampling layer and a concatenation layer, followed by a stack of convolution, BN, and ReLU activation layers.

The layers used:
What are ReLU layers in a CNN? A Rectified Linear Unit (ReLU) layer performs a threshold operation on each element of the input, where any value less than zero is set to zero: f(x) = x for x ≥ 0, and f(x) = 0 for x < 0.
What is an up-sampling layer? The up-sampling layer is a simple layer with no weights that doubles the dimensions of its input.
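As a closing sketch (assuming PyTorch, which the lecture does not specify), one U-Net encoder block as described above: two 3 × 3 convolution + ReLU layers, then a 2 × 2 max pooling, with the feature maps doubling from stage to stage; the channel counts 64 to 512 are the usual U-Net choice, assumed here.

    import torch
    import torch.nn as nn

    class EncoderBlock(nn.Module):
        def __init__(self, in_ch, out_ch):
            super().__init__()
            self.conv = nn.Sequential(
                nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
                nn.ReLU(inplace=True),               # f(x) = max(x, 0)
                nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
                nn.ReLU(inplace=True),
            )
            self.pool = nn.MaxPool2d(kernel_size=2)  # 2x2 max pooling

        def forward(self, x):
            skip = self.conv(x)      # kept for the decoder's concatenation
            return self.pool(skip), skip

    # Four encoder stages on a 128 x 128 single-channel input,
    # doubling the feature maps: 64 -> 128 -> 256 -> 512.
    x, skips = torch.randn(1, 1, 128, 128), []
    for in_ch, out_ch in [(1, 64), (64, 128), (128, 256), (256, 512)]:
        x, s = EncoderBlock(in_ch, out_ch)(x)
        skips.append(s)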