Lecture 3: Geometric & Spatial Operations
Uploaded by LushStatistics
Summary
This document details the mathematical operations for image transformations (scaling, rotation, and translation) commonly used in computer vision and image editing, along with convolution-based spatial operations. It describes the algorithms and their practical applications, and provides theoretical background on resizing, shifting, and rotating images.
Full Transcript
Lecture 3
4. Geometric Operations
5. Spatial Operations

Geometric Operations: Transforming Image Shape and Position

1. Introduction to Geometric Operations
- Definition: Techniques that change the spatial arrangement of pixels in an image.
- Purpose: To resize, move, or rotate images without changing their content.
- Importance: Fundamental for image alignment, perspective correction, composition, and creating visual effects.

What Are They?
- Like moving, stretching, or rotating a rubber sheet with a picture on it
- Change where pixels are located without changing their colors
- Fundamental operations in image editing and computer vision
- Note: warpAffine (center of rotation, rotation matrix, angle in degrees, scaling factor)

Real-World Analogies:
- Like adjusting a photo in a physical photo album
- Similar to moving, resizing, or rotating shapes in PowerPoint
- Think of manipulating a piece of stretchy fabric with a pattern on it

2. Scaling: Resizing Images
What is Scaling?
- Changing the size of an image by making it larger or smaller.
- Can be uniform (same in all directions) or non-uniform (different in horizontal and vertical directions).
- Uniform: multiply the x and y axes by the same scaling factor.
- Non-uniform: multiply the x and y axes by different scaling factors.

How Scaling Works:
1. Determine the scaling factor (e.g., 2x for doubling size, 0.5x for halving).
2. For each pixel in the new image:
   - Calculate its corresponding position in the original image.
   - Assign a color based on the original image's pixels.

Basic Concept:
  Original (4x4):    Scaled Up (8x8):    Scaled Down (2x2):
  [■][■][■][■]       [■][■][■][■]...     [■][■]
  [■][■][■][■]       [■][■][■][■]...     [■][■]
  [■][■][■][■]       [■][■][■][■]...
  [■][■][■][■]       [■][■][■][■]...
                     ...
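The two steps under "How Scaling Works" can be sketched in a few lines. This is a minimal illustration rather than the lecture's own code: it assumes NumPy, uses nearest-neighbor sampling as the simplest way to "assign a color", and the function name `scale_nearest` and the 4x4 test image are made up for the example.

```python
import numpy as np

def scale_nearest(image: np.ndarray, factor_y: float, factor_x: float) -> np.ndarray:
    """Resize an image (H x W or H x W x C) by mapping each output pixel
    back to the original image and copying the nearest source pixel."""
    in_h, in_w = image.shape[:2]
    out_h = max(1, int(round(in_h * factor_y)))
    out_w = max(1, int(round(in_w * factor_x)))

    # Step 2a: for every output row/column, find the corresponding
    # position in the original image (pixel centers are at index + 0.5).
    rows = np.floor((np.arange(out_h) + 0.5) / factor_y).astype(int)
    cols = np.floor((np.arange(out_w) + 0.5) / factor_x).astype(int)

    # Guard against rounding pushing an index past the last row/column.
    rows = np.clip(rows, 0, in_h - 1)
    cols = np.clip(cols, 0, in_w - 1)

    # Step 2b: assign the color found at that position.
    return image[rows[:, None], cols[None, :]]

original = np.arange(16).reshape(4, 4)            # hypothetical 4x4 image
scaled_up = scale_nearest(original, 2, 2)         # shape (8, 8)
scaled_down = scale_nearest(original, 0.5, 0.5)   # shape (2, 2)
```

Passing different values for `factor_y` and `factor_x` gives non-uniform scaling. Note that the loop runs over output pixels and looks backwards into the source, which guarantees every output pixel receives a value.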
Uses of Scaling:
- Resizing images for different display sizes (e.g., thumbnails, such as the cover images of YouTube videos)
- Preparing images for printing at different resolutions
- Zooming in or out in photo editing software

Scaling up (zooming in) increases the dimensions (e.g., resolution, width, height). Scaling down (zooming out) decreases the dimensions.
Example:
- Original: 1000x1000 pixel image
- Scaled: 500x500 pixel image (downscaled) or 2000x2000 pixel image (upscaled)

Types of Scaling:
1. Upscaling (Making Larger)
- Like stretching a rubber band
- Creates new pixels between existing ones
- Challenge: Need to fill in the new spaces, i.e., create new pixel information
Example:
  Original:    Upscaled 2x:
  [A][B]       [A][?][B]
  [C][D]       [?][?][?]
               [C][?][D]

2. Downscaling (Making Smaller)
- Like squishing a sponge
- Combines existing pixels
- Challenge: Deciding what information to keep when combining existing pixel information
Example:
  Original:        Downscaled:
  [A][B][C][D]     [AB]
  [E][F][G][H]     [EF]
  [I][J][K][L]
  [M][N][O][P]

Interpolation Methods:
- Nearest Neighbor: Fast but can look blocky
- Bilinear: Smoother results, uses weighted average of neighboring pixels
- Bicubic: Even smoother, considers larger neighborhood

1. Nearest Neighbor
- Each new pixel simply takes the value of the pixel nearest to it, and nothing more.
- No new intensity values are made up.
- Pros: Fast, preserves sharp edges
- Cons: Can look blocky
- Use when: Working with pixel art or icons

2. Bilinear
- Considers the 2x2 block of 4 neighboring pixels
- Note: Take the two nearest neighbors for each one.
After Scaling:
  [1.0][1.5][2.0]
  [2.0][2.5][3.0]
  [3.0][3.5][4.0]
- Pros: Smoother results
- Cons: Can blur sharp edges
- Use when: Working with photographs
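To make the "weighted average of neighboring pixels" idea concrete, here is a small bilinear resize sketch. It is an illustration under stated assumptions, not the lecture's implementation: it uses NumPy, handles grayscale images only, maps output corners onto input corners (one of several common conventions), and assumes the 3x3 grid in the bilinear example came from the 2x2 image [[1, 2], [3, 4]], which the transcript does not preserve. The name `resize_bilinear` is hypothetical.

```python
import numpy as np

def resize_bilinear(image: np.ndarray, out_h: int, out_w: int) -> np.ndarray:
    """Resize a 2D grayscale image using bilinear interpolation."""
    img = image.astype(float)
    in_h, in_w = img.shape

    # Corner-aligned mapping (an assumption): the first and last output
    # samples coincide with the first and last input pixels.
    ys = np.linspace(0, in_h - 1, out_h)
    xs = np.linspace(0, in_w - 1, out_w)

    # Indices of the 2x2 block of neighbors around each sample position.
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, in_h - 1)
    x1 = np.minimum(x0 + 1, in_w - 1)

    # Fractional distances act as the interpolation weights.
    wy = (ys - y0)[:, None]
    wx = (xs - x0)[None, :]

    # Blend horizontally along the upper and lower rows, then vertically.
    top = img[y0[:, None], x0[None, :]] * (1 - wx) + img[y0[:, None], x1[None, :]] * wx
    bottom = img[y1[:, None], x0[None, :]] * (1 - wx) + img[y1[:, None], x1[None, :]] * wx
    return top * (1 - wy) + bottom * wy

original = np.array([[1.0, 2.0],
                     [3.0, 4.0]])   # assumed source of the 3x3 example grid
upscaled = resize_bilinear(original, 3, 3)
```

Under these assumptions `upscaled` reproduces the 3x3 grid from the bilinear example: every in-between value is the average of the original pixels around it, which is also why sharp edges get softened.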
3. Bicubic
- Considers a 4x4 grid of surrounding pixels (16 neighboring pixels)
- Most sophisticated method
- Best for high-quality photo resizing

3. Translation: Moving Images
What is Translation?
- Shifting an image in a straight line without changing its orientation or size.
- Can be horizontal, vertical, or both.

Basic Concept:
- Like sliding a paper on a desk
- Moves all pixels by the same amount in the same direction

How Translation Works:
1. Choose the direction and distance to move the image.
2. For each pixel in the new image:
   - Calculate its corresponding position in the original image.
   - Assign the color from the original position to the new position.

Visual Example:
  Original Position:    Translated Right:
  [■][ ][ ]             [ ][■][ ]
  [■][ ][ ]       →     [ ][■][ ]
  [■][ ][ ]             [ ][■][ ]

Handling Edges:
- When moving an image, some areas may have no corresponding pixels in the original.
- Options for these areas:
  1. Fill with a solid color (e.g., black or white)
  2. Wrap around (pixels from one edge appear on the opposite edge)
  3. Extend edge pixels

Edge Handling Methods:
  Zero Fill
  Original:    Moved Right:
  [A][B]       [0][A][B]
  [C][D]       [0][C][D]

  Wrap Around
  Original:    Moved Right:
  [A][B]       [B][A][B]
  [C][D]       [D][C][D]

  Border Extension
  Original:    Moved Right:
  [A][B]       [A][A][B]
  [C][D]       [C][C][D]

Uses of Translation:
- Aligning multiple images (e.g., in panorama stitching)
- Centering objects within a frame
- Creating simple animations or slideshows
Example:
- Original: Image of a car on the left side
- Translated: Same image with the car moved to the right side

4. Rotation: Spinning Images
What is Rotation?
- Turning an image around a fixed point, usually the center.
- Measured in degrees or radians.
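As a rough illustration of turning around a fixed point, the sketch below rotates a single pixel coordinate about a center using the standard 2D rotation formula. The function name `rotate_point` and the sample coordinates are invented for this example. It also assumes the usual math convention in which a positive angle is counterclockwise when the y axis points up; in image coordinates, where y grows downward, the same formula looks clockwise on screen.

```python
import math

def rotate_point(x: float, y: float, cx: float, cy: float, angle_deg: float):
    """Rotate the point (x, y) about the center (cx, cy) by angle_deg degrees."""
    theta = math.radians(angle_deg)      # trig functions expect radians

    # Express the point relative to the center of rotation.
    dx, dy = x - cx, y - cy

    # Standard 2D rotation, then shift back to absolute coordinates.
    new_x = cx + dx * math.cos(theta) - dy * math.sin(theta)
    new_y = cy + dx * math.sin(theta) + dy * math.cos(theta)
    return new_x, new_y

# A point one pixel to the right of the center, rotated by 90 degrees,
# ends up (up to floating-point rounding) one pixel along the y axis.
new_x, new_y = rotate_point(2.0, 1.0, cx=1.0, cy=1.0, angle_deg=90)
```

The results are generally not whole numbers, which is why rotated pixels rarely land exactly on the pixel grid.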
Basic Concept:
- Like turning a wheel
- Spins image around a center point
- Uses trigonometry to calculate new positions

Visual Example: Original, 90° Rotation, 180° Rotation
- Note: Rotation is counterclockwise. For 90°, move one step; for 180°, move ...

Challenges in Rotation:
- Pixels may not align perfectly after rotation, requiring interpolation.
- Corners of the image may be cut off or create empty spaces.
  Original:      45° Rotation:
  [■][■][■]      [ ][■][ ]
  [■][■][■]  →   [■][■][■]
  [■][■][■]      [ ][■][ ]
- Empty corners need filling
- Pixels may land between grid points
- Note: Rotation does not enlarge the canvas.

Handling Rotation Challenges:
- Interpolation: Similar methods to scaling (nearest neighbor, bilinear, bicubic).
- Empty spaces: Can be filled with a background color or the image can be cropped.
1. Empty Space Handling
- Fill with background color
- Crop to remove empty space
- Expand canvas to show full rotation
2. Pixel Position Handling
- Use interpolation methods (like in scaling)
- Round to nearest position
- Weight based on distance

5. Combining Operations
Transformation Matrices:
- All these operations can be represented using matrices.
- Combining operations becomes a matter of multiplying matrices.

Order of Operations:
- The order in which you apply transformations matters.
- Example: Rotating then translating gives a different result than translating then rotating.

Real-world Applications:
- Image registration: Aligning images from different sources or times
- Computer graphics: Creating 3D effects in 2D images
- Augmented reality: Placing virtual objects in real-world images (e.g., virtual reality glasses)

Example Sequence:
  Original:    Scaled:        Rotated:       Translated:
  [■][■]       [■][■][■]      [ ][■][ ]      [ ][■]
  [■][■]   →   [■][■][■]  →   [■][■][■]  →   [ ][■][■]
               [■][■][■]      [ ][■][ ]      [■][■][ ]

Order Matters (rotation shown as 90° clockwise):
1. Scale then Rotate:
  [A][B]  →  [A][A][B]  →  [C][A]
  [C][D]     [C][C][D]     [C][A]
                           [D][B]
2. Rotate then Scale:
  [A][B]  →  [C][A]  →  [C][C][A]
  [C][D]     [D][B]     [D][D][B]
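The claim that combining operations reduces to multiplying matrices, and that order matters, can be checked numerically. The sketch below is one possible way to do it, not the lecture's own code. It assumes NumPy and the common convention of 3x3 homogeneous matrices acting on column vectors (x, y, 1), which the slides do not spell out; the helper names `translation`, `scaling`, and `rotation` are made up for the example.

```python
import numpy as np

def translation(tx: float, ty: float) -> np.ndarray:
    """3x3 homogeneous matrix that shifts points by (tx, ty)."""
    return np.array([[1.0, 0.0, tx],
                     [0.0, 1.0, ty],
                     [0.0, 0.0, 1.0]])

def scaling(sx: float, sy: float) -> np.ndarray:
    """3x3 homogeneous matrix that scales x by sx and y by sy."""
    return np.array([[sx, 0.0, 0.0],
                     [0.0, sy, 0.0],
                     [0.0, 0.0, 1.0]])

def rotation(angle_deg: float) -> np.ndarray:
    """3x3 homogeneous matrix that rotates about the origin."""
    t = np.radians(angle_deg)
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s, 0.0],
                     [s, c, 0.0],
                     [0.0, 0.0, 1.0]])

point = np.array([1.0, 0.0, 1.0])   # the point (x=1, y=0) in homogeneous form

# With column vectors, the matrix written on the right is applied first.
rotate_then_translate = translation(2, 0) @ rotation(90)
translate_then_rotate = rotation(90) @ translation(2, 0)

p1 = rotate_then_translate @ point   # (1, 0) -> (0, 1) -> (2, 1)
p2 = translate_then_rotate @ point   # (1, 0) -> (3, 0) -> (0, 3)

# Several operations collapse into one matrix, applied in a single pass.
combined = translation(5, 5) @ rotation(90) @ scaling(2, 2)
p3 = combined @ point                # (1, 0) -> (2, 0) -> (0, 2) -> (5, 7)
```

Because the whole chain becomes a single matrix, an image only needs to be resampled once, however many operations were combined.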
6. Practical Applications
Photo Editing:
- Resizing for social media
- Creating thumbnails
- Rotating skewed photos
- Centering subjects
Computer Vision:
- Aligning images for comparison
- Preparing training data
- Object tracking
- Face alignment
Medical Imaging:
- Aligning scans from different times
- Standardizing image orientation
- Creating consistent views

7. Best Practices
Quality Considerations:
1. Minimize Operations
- Each transformation can lose quality
- Combine operations when possible
- Save final version in high quality
2. Choose Appropriate Methods
- Use nearest neighbor for pixel art
- Use bicubic for photos
- Consider target use (web, print, analysis)
3. Preserve Aspect Ratio
- Unless specifically needed
- Prevents distortion
- Maintains visual quality

Performance Tips:
1. Operation Order
- Downscale before other operations
- Combine multiple translations (when something is moved in several steps rather than all at once)
- Use matrix multiplication for efficiency
2. Memory Management
- Consider image size limits
- Use appropriate data types
- Clear unnecessary intermediates

Conclusion, Key Takeaways:
1. Understanding basic operations:
- Scaling changes size
- Translation moves position
- Rotation changes orientation
2. Quality vs. Speed:
- Better quality takes more time
- Choose methods based on needs
- Consider final use case
3. Common Pitfalls:
- Loss of quality in multiple operations
- Unexpected empty spaces
- Incorrect operation order

Conclusion: The Power of Geometric Operations
- These techniques allow us to:
1. Adapt images for various display or printing needs (scaling)
2. Reposition elements within an image (translation)
3. Change the orientation of images or objects within them (rotation)
4. Combine operations for complex transformations
- Remember:
  - Always consider the purpose of the transformation
  - Be aware of potential loss of information, especially in downscaling and rotation
  - These operations are the building blocks for more complex image manipulations

Spatial Operations in Image Processing

1. Introduction to Spatial Operations
- Definition: Techniques that modify pixel values based on the surrounding pixels.
- Purpose: To enhance images, detect features, or remove noise.
- Importance: Foundation for many advanced image processing and computer vision tasks

Understanding Spatial Operations
What Are They?
- Operations that look at pixels and their neighbors
- Like a chef tasting not just one ingredient, but how ingredients work together
- Change pixel values based on surrounding pixels' values

Real-World Analogies:
- Like smoothing wood with sandpaper
- Similar to blending colors in painting
- Think of focusing/defocusing a camera lens

2. Convolution: The Building Block of Spatial Operations
What is Convolution?
- A mathematical operation that combines two functions to produce a third function.
- In image processing, it's like sliding a small window (kernel) over an image and performing calculations.

Key Concepts:
1. Kernel (or Filter): A small, square matrix of numbers.
2. Sliding Window: The kernel moves across the entire image.
3. Weighted Sum: Each output pixel is a sum of weighted neighbor pixels.

Step-by-Step Convolution Process
Starting Position
- Original Image (5x5) and Kernel (3x3)
- Kernel (3x3), with the weights used in the products below:
  [1][2][1]
  [2][4][2]
  [1][2][1]
- Sum of kernel weights = 16

Step 1: First Position (Top-Left)
A. Position Kernel: place it over the top-left 3x3 block of the image.
B. Multiply Corresponding Values (pixel × weight):
   (1×1) + (2×2) + (1×1) +
   (3×2) + (4×4) + (2×2) +
   (2×1) + (3×2) + (4×1)
C. Calculate: 1 + 4 + 1 + 6 + 16 + 4 + 2 + 6 + 4 = 44
D. Normalize (divide by sum of kernel weights): 44 ÷ 16 = 2.75 (rounds to 3)

Step 2: Move One Position Right
A. Position Kernel: shift the window one column to the right.
B. Multiply and Sum (pixel × weight):
   (2×1) + (1×2) + (3×1) +
   (4×2) + (2×4) + (1×2) +
   (3×1) + (4×2) + (2×1)
C. Calculate: 2 + 2 + 3 + 8 + 8 + 2 + 3 + 8 + 2 = 38
D. Normalize: 38 ÷ 16 = 2.375 (rounds to 2)

Continue Process
The kernel continues sliding right until it reaches the end of the row, then moves down one row and starts from the left again.

Final Result After Complete Process: Original Image → Result Image

Visual Progression of Kernel Movement:
  Step 1:            Step 2:            Step 3:
  [■][■][■][ ][ ]    [ ][■][■][■][ ]    [ ][ ][■][■][■]
  [■][■][■][ ][ ]    [ ][■][■][■][ ]    [ ][ ][■][■][■]
  [■][■][■][ ][ ]    [ ][■][■][■][ ]    [ ][ ][■][■][■]
  [ ][ ][ ][ ][ ]    [ ][ ][ ][ ][ ]    [ ][ ][ ][ ][ ]
  [ ][ ][ ][ ][ ]    [ ][ ][ ][ ][ ]    [ ][ ][ ][ ][ ]

  Step 4:            Step 5:            Step 6:
  [ ][ ][ ][ ][ ]    [ ][ ][ ][ ][ ]    [ ][ ][ ][ ][ ]
  [■][■][■][ ][ ]    [ ][■][■][■][ ]    [ ][ ][■][■][■]
  [■][■][■][ ][ ]    [ ][■][■][■][ ]    [ ][ ][■][■][■]
  [■][■][■][ ][ ]    [ ][■][■][■][ ]    [ ][ ][■][■][■]
  [ ][ ][ ][ ][ ]    [ ][ ][ ][ ][ ]    [ ][ ][ ][ ][ ]

Key Points to Remember:
1. The kernel moves one pixel at a time, from left to right, top to bottom.
2. Each output pixel is calculated using the 9 pixels under the kernel.
3. The result is normalized by dividing by the sum of kernel weights (16 in this case).
4. Edge pixels require special handling (padding) since the kernel extends beyond the image boundaries.

Note on blurring: To blur a pixel intensity, replace its value with a weighted average of its neighboring pixel intensities, which effectively spreads the intensity across nearby pixels. This is commonly done using a convolution operation with a blurring kernel. A common choice is the box blur (all weights are equal) or the Gaussian blur (center weight is highest, and weights decrease with distance). In a 3x3 box blur kernel, for example, all nine weights are equal.

Effects of Different Kernel Values:
1. Using larger weights in the center:
- Preserves original pixel values more
- Less blurring effect
2. Using equal weights:
- Creates averaging effect
- More blurring
3. Using negative values:
- Can detect edges
- Can sharpen images

Note: The 3x3 box blur kernel averages the pixel intensity across a 3×3 neighborhood.

How Convolution Works:
1. Position the kernel over a section of the image.
2. Multiply each kernel value with the corresponding image pixel.
3. Sum up all these multiplications.
4. Place the result in the output image at the position of the kernel's center.
5. Slide the kernel to the next position and repeat.

Types of Kernels:
- Blur kernels: Smooth out details and reduce noise.
- Sharpen kernels: Enhance edges and fine details.
- Edge detection kernels: Highlight boundaries between objects.
Example:
- Input: Clear image of a face
- Convolution with blur kernel: Softened, slightly out-of-focus image
- Convolution with sharpen kernel: More defined features, enhanced details

Common Kernel Types:
1. Blur (Box Filter)
  All values = 1/9
  Effect: Averages nearby pixels
2. Sharpen
  [ 0][-1][ 0]
  [-1][ 5][-1]
  [ 0][-1][ 0]
  Effect: Enhances differences between pixels
3. Edge Detection
  [-1][-1][-1]
  [-1][ 8][-1]
  [-1][-1][-1]
  Effect: Highlights boundaries

Visual Examples: Original → Blurred or Sharpened
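The five steps of "How Convolution Works" and the three common kernels can be tried out with a short sketch. This is an illustration under stated assumptions rather than a reference implementation: it uses NumPy, slides the kernel without flipping it (strictly speaking cross-correlation, which gives the same result as convolution for the symmetric kernels used here), and skips padding, so the output is smaller than the input. The function name `apply_kernel` and the 5x5 test image are invented for the example.

```python
import numpy as np

def apply_kernel(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Slide the kernel over the image and return the weighted sums.
    No padding is used, so the output shrinks by (kernel size - 1)."""
    img = image.astype(float)
    kh, kw = kernel.shape
    out_h = img.shape[0] - kh + 1
    out_w = img.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for r in range(out_h):                       # top to bottom
        for c in range(out_w):                   # left to right
            window = img[r:r + kh, c:c + kw]     # pixels under the kernel
            out[r, c] = np.sum(window * kernel)  # multiply, then sum
    return out

# The three common kernels listed above.
box_blur = np.full((3, 3), 1 / 9)
sharpen = np.array([[0, -1, 0], [-1, 5, -1], [0, -1, 0]], dtype=float)
edge_detect = np.array([[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]], dtype=float)

# Hypothetical 5x5 image: dark on the left, bright on the right.
image = np.array([[10, 10, 10, 50, 50]] * 5)

blurred = apply_kernel(image, box_blur)        # the step becomes a gradual ramp
sharpened = apply_kernel(image, sharpen)       # contrast across the step grows
edges = apply_kernel(image, edge_detect)       # zero in flat areas, large at the step

# Step 1 of the worked example: pixel values read off the listed products,
# kernel weights already divided by their sum of 16.
patch = np.array([[1, 2, 1], [3, 4, 2], [2, 3, 4]])
weights = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]]) / 16
step1 = apply_kernel(patch, weights)[0, 0]     # 44 / 16 = 2.75
```

The explicit double loop mirrors the slide-by-slide description and is slow for large images; it is meant for reading, not for production use. Handling the border pixels would additionally require padding the image before the loop.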