Computer Vision Lecture Notes PDF

Computer Vision
By Dr. Ahmed Taha, Lecturer, Computer Science Department, Faculty of Computers & Artificial Intelligence, Benha University
Image Basics (Lecture Three)

So what is an image?
- Eyes: projection onto the retina
- Model: pinhole camera
- Image: 3D -> 2D projection of the world
- At each point we record incident light

How do we record color?
- Traditional camera
- Digital camera: a CCD sensor, a CMOS sensor
- Digital camera: beam splitter
- A spinning disk filter
- Bayer pattern for CMOS sensors

An image is a matrix of light
Values in the matrix record how much light:
- Higher = more light
- Lower = less light
- Bounded
  - No light = 0
  - Sensor/device limit = max
- Typical ranges:
  - [0-255], fits into a byte
  - [0-1], floating point
- Called pixels

Addressing pixels
- Ways to index:
  - (x,y)
    - Like Cartesian coordinates
    - (3,6) is column 3, row 6
  - (r,c)
    - Like matrix notation
    - (3,6) is row 3, column 6
- I use (x,y)
  - So does your homework!
- Arbitrary
  - Only thing that matters is consistency

Color image: 3D tensor in colorspace
RGB information in separate “channels”. Remember: we can match “real” colors using a mix of primaries. Each channel encodes one primary. Adding the light produced from each primary mimics the original color.

Addressing pixels
- I use (x,y,c)
- (1,2,0): column 1, row 2, channel 0
- Still doesn’t matter, just be consistent
- But do what I do for homeworks :-)
- Also for size, a 1920 x 1080 x 3 image is:
  - 1920 px wide
  - 1080 px tall
  - 3 channels

How do we store them?
129 131 152 159 135 163 142 82 182...

Storage: row major vs column major
- Row major: HW
- Column major: WH
- Used here: HW
- With channels there are more choices!

HWC: channels interleaved
129 131 152 159 135 163 142 82 182...
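As a minimal sketch of the interleaved (HWC) layout and the two value ranges mentioned above, the C helpers below compute where a channel value would sit and convert a byte to a float. This assumes the usual row-major convention in which the channel varies fastest, then x, then y. The names `hwc_index` and `byte_to_unit` are hypothetical and not part of the homework code.

```c
#include <assert.h>
#include <stddef.h>

/* Offset of pixel (x, y), channel c, in an interleaved HWC buffer.
   Assumed layout: all channels of one pixel are adjacent, pixels run
   left to right within a row, and rows are stored top to bottom.
   w is the image width, channels the number of channels. */
size_t hwc_index(int x, int y, int c, int w, int channels)
{
    assert(x >= 0 && x < w && y >= 0 && c >= 0 && c < channels);
    return (size_t)c + (size_t)channels * ((size_t)x + (size_t)w * (size_t)y);
}

/* Map a byte in [0, 255] to a float in [0, 1]. */
float byte_to_unit(unsigned char v)
{
    return (float)v / 255.0f;
}
```

For a 1920-pixel-wide, 3-channel image, `hwc_index(1, 0, 0, 1920, 3)` is 3: the second pixel starts right after the three channel values of the first.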
CHW: channels separated
129 131 152...
135 163 142...
182...

CHW pop quiz
We’ll use CHW, it’s what a lot of other libraries use. In an array for a 1920 x 1080 x 3 image, what entry would contain the pixel (15,192,2)? In groups, discuss for 2 minutes.

In general, for pixel (x,y,z) of an image (W,H,C):
    x + y*W + z*W*H
    15 + 192*1920 + 2*1920*1080 = 4,515,855
Remember, everything is 0 indexed. This isn’t MATLAB.

In your homework
    typedef struct {
        int w,h,c;
        float *data;
    } image;

Fun with other colorspaces!
- Geometric HSV to RGB [figure]
- Still 3D tensor, different info: Hue, Saturation, Value
- More saturation = intense colors (2x)
- More value = lighter image (2x)
- Shift hue = shift colors (-.2)
- Set hue to your favorite color! Or pattern...
- [Figures: saturation examples]
- Exposure: similarly, we can very easily manipulate the exposure of an image by modifying its value

Image interpolation and resizing

An image is kinda like a function
An image is a mapping from indices to pixel value:
- Im: I x I x I -> R
We may want to pass in non-integers:
- Im’: R x R x I -> R

A note on coordinates in images
[Figure] This point is: (-.25, -.25)
Just be careful, lots of pitfalls.

Nearest neighbor: what it sounds like
    f(x,y,z) = Im(round(x), round(y), z)
- Looks blocky
- Common pitfall: integer division in C truncates (rounds down for non-negative values)
- Note: z is still int
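The CHW offset formula and the nearest-neighbor rule from these notes can be sketched in C as follows. The `image` struct is the one shown for the homework; `chw_index`, `get_pixel`, `round_to_int`, and `nn_interpolate` are hypothetical helper names, and clamping out-of-range coordinates to the border is an assumption, not something the notes specify.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    int w, h, c;
    float *data;   /* assumed CHW layout */
} image;

/* CHW offset from the notes: x + y*W + z*W*H, everything 0-indexed. */
size_t chw_index(int x, int y, int z, int w, int h)
{
    return (size_t)x + (size_t)y * (size_t)w + (size_t)z * (size_t)w * (size_t)h;
}

static int clamp_int(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Read one value; x and y are clamped to the image (assumed padding rule). */
float get_pixel(image im, int x, int y, int z)
{
    assert(z >= 0 && z < im.c);
    x = clamp_int(x, 0, im.w - 1);
    y = clamp_int(y, 0, im.h - 1);
    return im.data[chw_index(x, y, z, im.w, im.h)];
}

/* Round to the nearest integer, halves away from zero.
   A plain (int) cast would truncate toward zero instead. */
static int round_to_int(float v)
{
    return (int)(v < 0.0f ? v - 0.5f : v + 0.5f);
}

/* f(x,y,z) = Im(round(x), round(y), z); z stays an int. */
float nn_interpolate(image im, float x, float y, int z)
{
    return get_pixel(im, round_to_int(x), round_to_int(y), z);
}
```

With these definitions, `chw_index(15, 192, 2, 1920, 1080)` evaluates to 4515855, matching the pop quiz answer.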
Triangle interpolation: for less structured images
Sometimes you have a regular grid, sometimes you don’t. When you don’t, look for triangles!
Weighted sum using the areas of the triangles:
    Q = V1*A1 + V2*A2 + V3*A3
Should normalize this based on total area.

Bilinear interpolation: for grids, pretty good
This time find the closest pixels in a box. Same plan: weighted sum based on the area of the opposite rectangle.
    Q = V1*A1 + V2*A2 + V3*A3 + V4*A4
Still need to normalize! Or do we?

Alternatively, linear interpolation of linear interpolates:
    q1 = V1*d2 + V2*d1
    q2 = V3*d2 + V4*d1
    q = q1*d4 + q2*d3
Equivalent:
    q = q1*d4 + q2*d3
    q = (V1*d2 + V2*d1)*d4 + (V3*d2 + V4*d1)*d3              (substitution)
    q = V1*d2*d4 + V2*d1*d4 + V3*d2*d3 + V4*d1*d3            (distribution)
Recall:
    A1 = d2*d4
    A2 = d1*d4
    A3 = d2*d3
    A4 = d1*d3
    q = V1*A1 + V2*A2 + V3*A3 + V4*A4
yay! (When d1 + d2 = 1 and d3 + d4 = 1, as on a unit pixel grid, the four areas sum to 1, so no extra normalization is needed.)

Bilinear interpolation summary:
- Smoother than NN
- More complex
  - 4 lookups
  - Some math
- Often the right tradeoff of speed vs final result

Bicubic sampling: more complex, maybe better?
- A cubic interpolation of 4 cubic interpolations
- Smoother than bilinear, no “star”
- 16 nearest neighbors
- Fit 3rd order poly:
  - f(x) = a + bx + cx^2 + dx^3
- Interpolate along axis
- Fit another poly to interpolated values

Bicubic vs bilinear [figure comparison]

Resize algorithm:
- For each pixel in new image:
  - Map to old im coordinates
  - Interpolate value
  - Set new value in image

What about shrinking?
- NN and bilinear only look at a small area
- Lots of artifacting
- Staircase pattern on diagonal lines
- We’ll fix this next class with filters!
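The bilinear weighting and the resize loop described in these notes can be sketched in C as below. Several details are assumptions rather than anything the slides state: V1..V4 are taken as the top-left, top-right, bottom-left, and bottom-right neighbors; d1 and d3 are the distances from the left column and top row; out-of-range reads are clamped to the border; and new pixels are mapped to old coordinates with a pixel-center convention, under which the first pixel of a 2x enlargement lands at (-.25, -.25). The helper names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    int w, h, c;
    float *data;   /* assumed CHW layout: x + y*w + z*w*h */
} image;

static int clamp_int(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Read one value, clamping x and y to the border (assumed padding rule). */
static float get_pixel(image im, int x, int y, int z)
{
    x = clamp_int(x, 0, im.w - 1);
    y = clamp_int(y, 0, im.h - 1);
    return im.data[(size_t)x + (size_t)im.w * ((size_t)y + (size_t)im.h * (size_t)z)];
}

/* Largest integer <= v; an (int) cast alone truncates toward zero. */
static int floor_int(float v)
{
    int i = (int)v;
    return (v < (float)i) ? i - 1 : i;
}

/* Linear interpolation of linear interpolates, as in the notes. */
float bilinear_interpolate(image im, float x, float y, int z)
{
    int x0 = floor_int(x), y0 = floor_int(y);
    float d1 = x - (float)x0, d2 = 1.0f - d1;   /* from left / to right  */
    float d3 = y - (float)y0, d4 = 1.0f - d3;   /* from top  / to bottom */
    float v1 = get_pixel(im, x0,     y0,     z);
    float v2 = get_pixel(im, x0 + 1, y0,     z);
    float v3 = get_pixel(im, x0,     y0 + 1, z);
    float v4 = get_pixel(im, x0 + 1, y0 + 1, z);
    float q1 = v1 * d2 + v2 * d1;               /* top row    */
    float q2 = v3 * d2 + v4 * d1;               /* bottom row */
    return q1 * d4 + q2 * d3;
}

/* Resize: for each new pixel, map to old coordinates, interpolate, store.
   Returns an image whose data is NULL if allocation fails. */
image bilinear_resize(image im, int w, int h)
{
    assert(w > 0 && h > 0);
    image out = { w, h, im.c, NULL };
    out.data = calloc((size_t)w * (size_t)h * (size_t)im.c, sizeof(float));
    if (!out.data) return out;
    float sx = (float)im.w / (float)w;
    float sy = (float)im.h / (float)h;
    for (int z = 0; z < im.c; ++z) {
        for (int y = 0; y < h; ++y) {
            for (int x = 0; x < w; ++x) {
                /* Pixel-center mapping (an assumption of this sketch). */
                float ox = ((float)x + 0.5f) * sx - 0.5f;
                float oy = ((float)y + 0.5f) * sy - 0.5f;
                size_t i = (size_t)x + (size_t)w * ((size_t)y + (size_t)h * (size_t)z);
                out.data[i] = bilinear_interpolate(im, ox, oy, z);
            }
        }
    }
    return out;
}
```

The caller owns the returned buffer and should `free` it. As the notes point out, this kind of sampling only looks at a small neighborhood, so using it to shrink an image will show artifacts.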
