Week 1 - Unit I - Image Fundamentals
Document Details
Uploaded by FrugalAlpenhorn
Gyalpozhing College of Information Technology
Summary
This document provides an introduction to Image Fundamentals, covering topics such as pixels, image channels, coordinate systems, and image processing techniques. It's suitable for an introductory computer vision course at an undergraduate level.
Full Transcript
Unit I: Image Fundamentals
Images, pixels, image channels, coordinate system
CSA301 Deep Learning

Learning Outcomes
- Define and explain the building blocks of an image.
- Explain how an image is formed from its channels.
- Describe the image coordinate system.

What are images?
Before building our own classifier, we first need to understand what an image is. Humans see by transforming light into electrical signals in the eye, which are then processed by the brain. Computers process information in digital form, composed of bits (0 or 1). An image is a 2-D representation of the visible light spectrum. Each pixel in an image represents light of particular wavelengths, which correspond to different colours.
Figure: the electromagnetic spectrum.

Pixels: The building blocks of images
A pixel is the smallest unit of information in an image. Pixels are arranged in a 2-dimensional grid and are usually drawn as squares. A pixel is the color or intensity of light that appears at a specific location in the image. Each pixel is a sample of an original image, and more samples typically provide a more accurate representation of the original. There is no finer granularity than the pixel.

An image with 300 rows (height) and 200 columns (width) contains 300 x 200 = 60,000 pixels in total. Note that resolution is conventionally quoted as width x height, so this image would usually be written as 200 x 300.

A digital image is composed of one channel for a grayscale image or three channels (red, green and blue) for a color image.

Grayscale/single channel
In a grayscale image, each pixel is a scalar value between 0 and 255, where 0 corresponds to “black” and 255 to “white”. Values in between are varying shades of grey: values closer to 0 are darker and values closer to 255 are lighter.
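To make the pixel grid concrete, here is a minimal sketch in plain Python of a grayscale image with 300 rows and 200 columns. The helper `make_gradient` is a hypothetical name introduced only for this illustration; in practice an image library would store the same grid as an array rather than nested lists.

```python
# Hypothetical helper (not from the slides): build a grayscale image
# as a list of rows, where each row is a list of pixel intensities.
def make_gradient(height, width):
    """Return a height x width grid that brightens from left to right."""
    image = []
    for _ in range(height):
        # Map column 0 to 0 (black) and the last column to 255 (white).
        row = [round(255 * col / (width - 1)) for col in range(width)]
        image.append(row)
    return image


image = make_gradient(height=300, width=200)

rows = len(image)           # height: number of rows
cols = len(image[0])        # width: number of columns
total_pixels = rows * cols  # every grid cell is one pixel

darkest = min(min(row) for row in image)
brightest = max(max(row) for row in image)

print(rows, cols, total_pixels)  # 300 200 60000
print(darkest, brightest)        # 0 255
```

Each entry of the grid is one scalar in the range 0 to 255, which is exactly the single-channel case described above.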
Color
The RGB (red, green, blue) color space is commonly used to represent color pixels. In the RGB color space, a pixel is no longer a single scalar value as in a grayscale image; instead, it is a list of three values: red, green and blue. Each RGB channel takes values between 0 and 255, where 0 means the color is absent and 255 means it is at full intensity.

Additive color space
- Red + Green = Yellow (255, 255, 0)
- Red + Blue = Magenta (255, 0, 255)
- All colors = ? (___, ___, ___)

Formation of image from channels
The RGB image that we perceive through our eyes is represented by three components: red, green and blue. An RGB image is made up of three distinct matrices of width W and height H, one for each of the RGB components. These three matrices can be combined into a multi-dimensional array with the shape W x H x D, where D denotes the depth, or number of channels (in our case, D = 3). Libraries such as NumPy store the same array with rows first, that is, with shape (H, W, D).
Figure: an RGB image shown as three independent matrices.

Image coordinate system
In an 8 x 8 image, the point (0, 0) corresponds to the top-left pixel, whereas the point (7, 7) corresponds to the bottom-right corner. Counting starts from zero rather than one: Python is zero-indexed, meaning that we always start counting from zero.

Image processing tools
Image processing is a set of techniques for manipulating images in order to enhance them or extract useful information from them. It is a critical step in the design of an image classifier: the collected image data may contain noise, which can reduce the accuracy of the classification model. Image processing techniques are therefore used to clean and prepare the image data (resizing, cropping, data augmentation).
Some image processing tools are: OpenCV, scikit-image, SimpleCV.
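The channel and coordinate ideas from this unit can be sketched together in plain Python. The three small matrices and the helper `merge_channels` are hypothetical, chosen only for illustration; the sketch assumes row-major storage, so the pixel at coordinate (x, y) is read as `image[y][x]`.

```python
# Three hypothetical channel matrices, each 2 rows (height) x 3 columns (width).
red   = [[255, 255, 0], [255, 0, 0]]
green = [[255, 0, 0], [0, 255, 0]]
blue  = [[0, 255, 0], [0, 0, 255]]


def merge_channels(r, g, b):
    """Combine three H x W matrices into an H x W grid of (R, G, B) tuples."""
    height, width = len(r), len(r[0])
    return [
        [(r[y][x], g[y][x], b[y][x]) for x in range(width)]
        for y in range(height)
    ]


image = merge_channels(red, green, blue)

height = len(image)        # number of rows
width = len(image[0])      # number of columns
depth = len(image[0][0])   # number of channels
print(height, width, depth)  # 2 3 3

# Zero-indexed coordinates: (x=0, y=0) is the top-left pixel and
# (x=width-1, y=height-1) is the bottom-right pixel.
top_left = image[0][0]
bottom_right = image[height - 1][width - 1]
print(top_left)      # (255, 255, 0) -> red + green = yellow
print(image[0][1])   # (255, 0, 255) -> red + blue = magenta
print(bottom_right)  # (0, 0, 255)   -> blue only
```

Note the index order: because rows come first, the row (y) index precedes the column (x) index when reading a pixel, even though coordinates are usually written as (x, y).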