Week 2 - Unit I - Image Fundementals - II.pptx.pdf
Uploaded by FrugalAlpenhorn
Gyalpozhing College of Information Technology
Unit I: Image Fundamentals
Image as NumPy, Image transformation
CSA301 Deep Learning

Learning Outcomes
○ Understand how images are represented as arrays
○ Modify image pixels
○ Describe and explain different image transformations
○ Explain the different colour spaces used in images

Virtual environment
A virtual environment is a tool that keeps the dependencies required by different projects separate by creating isolated Python environments. It is one of the most important tools that Python developers use.

Virtual environment - Installation
Install Python 3 from https://www.python.org/download/releases/3.0/
Navigate to the desired location in Windows from the command prompt and type:
    python3 -m venv "name of the environment"
Here, we have created the ImageLab virtual environment inside local disk D.
Now navigate into the new environment using the cd command and activate it.
Check the list of packages installed in the environment. Here, we are viewing the packages installed in the ImageLab virtual environment.
Now you can install the OpenCV and NumPy libraries and list the packages again:
    pip install numpy
    pip install opencv-python
    pip list
Now we can see that the NumPy and OpenCV libraries are installed in the virtual environment.
Now you can install Visual Studio Code (VS Code) to write the code: https://code.visualstudio.com/
Open VS Code from the command prompt. Once you hit Enter, VS Code will open.

OpenCV?
○ Officially launched in 1999, the OpenCV project was initially an Intel Research initiative.
○ Written in C++, OpenCV is one of the most popular computer vision libraries
○ Widely used for commercial and academic purposes
○ It has bindings for Python
○ Official website: https://opencv.org/

Loading and Displaying Image
    # Filename: load_display.py
    # import library
    import cv2 as cv
    # load the image
    img = cv.imread("lenna.png")
    # display the image
    cv.imshow("Original Image", img)
    cv.waitKey(0)
https://en.wikipedia.org/wiki/Lenna

Grayscale Conversion
    # import library
    import cv2 as cv
    # load image
    img = cv.imread("lenna.png")
    # display the original image
    cv.imshow("Original Image", img)
    # grayscale conversion
    gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
    # display the grayscale image
    cv.imshow("Grayscale", gray)
    cv.waitKey(0)

Image as NumPy arrays
Image processing libraries such as OpenCV and scikit-image represent colour images as multi-dimensional NumPy arrays with shape (height, width, depth). OpenCV stores the channels in BGR order. The height comes first because of matrix notation: we usually write the dimensions of a matrix as rows x columns. The height of an image is the number of rows, while the width is the number of columns.

Shape of the image
    # Filename: shape.py
    # Editor: VS Code
    import cv2 as cv
    img = cv.imread("logo.jpg")
    print(img.shape)
    cv.imshow("Image", img)
    cv.waitKey(0)
The image contains 382 rows and 424 columns with 3 channels.

Access pixel value
    # Filename: access_pixels.py
    # Editor: VS Code
    import cv2 as cv
    img = cv.imread("logo.jpg")
    # access pixel at x = 300, y = 200 (NumPy indexing is [row, column])
    (b, g, r) = img[200, 300]
    print("Blue Color: ", b)
    print("Green Color: ", g)
    print("Red Color: ", r)

Edit the Pixels
    # Filename: edit_pixels.py
    # Load the library
    import cv2 as cv
    # Load the image
    image = cv.imread("lenna.png")
    # Assign h and w
    (h, w) = image.shape[:2]
    cv.imshow("Original Image", image)
    # set the top-left corner of the original image to be green
    image[0:50, 0:100] = (0, 255, 0)
    # Show our updated image
    cv.imshow("Updated", image)
    cv.waitKey(0)

Drawings in OpenCV
OpenCV provides convenient, easy-to-use methods to draw shapes on an image. Here, we will look at three of them:
○ cv2.line(image, start coordinates, end coordinates, color, thickness)
○ cv2.rectangle(image, start coordinates, end coordinates, color, thickness)
○ cv2.circle(image, center coordinates, radius, color, thickness)

Drawing Lines
    # Filename: draw_lines.py
    # import library
    import cv2 as cv
    # load image
    img = cv.imread("lenna.png")
    cv.imshow("Original Lena", img)
    # draw a blue line from (0, 0) to (100, 100), 3 pixels wide
    cv.line(img, (0, 0), (100, 100), (255, 0, 0), 3)
    cv.imshow("Line", img)
    cv.waitKey(0)

Drawing Rectangle
    # Filename: draw_rectangle.py
    # import library
    import cv2 as cv
    # load image
    img = cv.imread("lenna.png")
    cv.imshow("Original Lena", img)
    # draw the rectangle from its top-left corner to its bottom-right corner
    cv.rectangle(img, (50, 50), (200, 200), (255, 0, 0), 3)
    cv.imshow("Rectangle", img)
    cv.waitKey(0)

Drawing Circle
    # Filename: draw_circle.py
    # import library
    import cv2 as cv
    # load image
    img = cv.imread("lenna.png")
    cv.imshow("Original Lena", img)
    # draw the circle from its center coordinates and radius
    cv.circle(img, (200, 200), 100, (255, 0, 0), 3)
    cv.imshow("Circle", img)
    cv.waitKey(0)

Inserting Text
In OpenCV, the cv2.putText() method is used to draw a text string on an image.
    cv2.putText(image, text, org, font, fontScale, color, thickness)
○ image: The image on which the text is to be drawn.
○ text: The text string to be drawn.
○ org: The coordinates of the bottom-left corner of the text string in the image.
○ font: The font type, for example FONT_HERSHEY_SIMPLEX or FONT_HERSHEY_PLAIN.
○ fontScale: Font scale factor that is multiplied by the font-specific base size.
○ color: The color of the text string. For BGR, we pass a tuple, e.g. (255, 0, 0) for blue.
○ thickness: The thickness of the lines used to draw the text.

    # Filename: insert_text.py
    # Import libraries
    import cv2 as cv
    # Load image
    img = cv.imread("lenna.png")
    cv.imshow("Original Lena", img)
    # Draw the text with its bottom-left corner at (50, 50)
    text = "LENA"
    img = cv.putText(img, text, (50, 50), cv.FONT_HERSHEY_SIMPLEX, 1, (255, 0, 0))
    cv.imshow("Image", img)
    cv.waitKey(0)

Image transformation - Translation
Image translation is the shifting of an image along the x and y axes. Using translation, we can move the image up, down, left or right. If we know the amount of shift in the horizontal and vertical directions, say (tx, ty), then we can build a transformation matrix M, where tx represents the shift along the x-axis (horizontal) and ty represents the shift along the y-axis (vertical).
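For reference, the translation matrix described above is conventionally written as the 2 x 3 affine matrix below. This is a reconstruction from the standard form and from the values passed to np.float32 in translation.py, not a copy of the slide's figure; the symbols x' and y' for the shifted coordinates are notation introduced here.

```latex
M = \begin{bmatrix} 1 & 0 & t_x \\ 0 & 1 & t_y \end{bmatrix},
\qquad
\begin{bmatrix} x' \\ y' \end{bmatrix}
= M \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
= \begin{bmatrix} x + t_x \\ y + t_y \end{bmatrix}
```

With t_x = 25 and t_y = 50, for example, the pixel at (0, 0) moves to (25, 50).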
This can be implemented using the cv2.warpAffine() function, which takes the following parameters:
○ src: The image to be translated
○ M: The translation matrix
○ dsize: Output size, given as (width, height)

Image transformation - Translation
    # Filename: translation.py
    # Editor: VS Code
    import cv2 as cv
    import numpy as np
    img = cv.imread('logo.jpg')
    cv.imshow("Image", img)
    cv.waitKey(0)
    # define the translation matrix
    M = np.float32([[1, 0, 25], [0, 1, 50]])
    # dsize is (width, height), i.e. (columns, rows)
    shifted = cv.warpAffine(img, M, (img.shape[1], img.shape[0]))
    cv.imshow("Shifted", shifted)
    cv.waitKey(0)
In the output, the image has been shifted 25 pixels along the x-axis (to the right) and 50 pixels along the y-axis (downwards).

Image transformation - Rotation
Image rotation by a certain angle θ is done by defining a transformation matrix M. OpenCV provides the getRotationMatrix2D() function to create this transformation matrix. The syntax for creating the 2D rotation matrix is:
    cv2.getRotationMatrix2D(center, angle, scale)
○ center: the center of rotation for the input image
○ angle: the angle of rotation in degrees
○ scale: an isotropic scale factor which scales the image up or down according to the value provided

    # Filename: rotation.py
    # Editor: VS Code
    import cv2 as cv
    img = cv.imread("logo.jpg")
    cv.imshow("Original Image", img)
    cv.waitKey(0)
    # extract height and width
    (h, w) = img.shape[:2]
    center = (w // 2, h // 2)
    # get the rotation matrix: 45 degrees about the center, scale 1
    M = cv.getRotationMatrix2D(center, 45, 1)
    rotated = cv.warpAffine(img, M, (w, h))
    cv.imshow("Rotated by 45 Degrees", rotated)
    cv.waitKey(0)

Image transformation - Scaling
Scaling is the method of resizing the image according to the requirement. Scaling does one of two things: it either enlarges the image or shrinks it.
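When shrinking or enlarging, it is often useful to keep the width-to-height ratio so that the image is not distorted. The sketch below is a minimal illustration and not part of the original slides: scaled_size is a hypothetical helper name, and the 424 x 382 size is the shape reported earlier for logo.jpg. It computes a (width, height) tuple, which is the order cv.resize expects for its size argument.

```python
def scaled_size(width, height, target_width):
    """Return (new_width, new_height) that keeps the aspect ratio.

    Hypothetical helper for illustration; not an OpenCV function.
    """
    if width <= 0 or height <= 0 or target_width <= 0:
        raise ValueError("all dimensions must be positive")
    ratio = target_width / width
    # Round to the nearest pixel and never return a zero height
    new_height = max(1, round(height * ratio))
    return (target_width, new_height)


# An image with 382 rows and 424 columns: width = 424, height = 382
new_size = scaled_size(424, 382, 200)
print(new_size)
```

The resulting tuple could then be passed to the resize call, for example cv.resize(img, new_size, interpolation=cv.INTER_AREA), instead of a fixed square size.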
The code below resizes the image to 200 by 200 pixels:
    # Filename: resize.py
    # Editor: VS Code
    import cv2 as cv
    img = cv.imread("logo.jpg")
    cv.imshow("Original Image", img)
    cv.waitKey(0)
    # dsize is (width, height); INTER_AREA is suited to shrinking
    img_shrunk = cv.resize(img, (200, 200), interpolation=cv.INTER_AREA)
    cv.imshow("Shrunk Image", img_shrunk)
    cv.waitKey(0)

Image transformation - Flipping
OpenCV provides a method to flip an image across its x-axis or y-axis. Although flipping is used less often than the other transformations, it is still valuable to learn. For example, if you want to build a face classifier and have only a few face images, you can use flipping to augment your dataset.
    # Filename: flip.py
    # Editor: VS Code
    import cv2 as cv
    img = cv.imread("logo.jpg")
    cv.imshow("Original Image", img)
    cv.waitKey(0)
    flipped = cv.flip(img, 1)
    cv.imshow("Flipped Horizontally", flipped)
    cv.waitKey(0)
Note: a flip code of 1 flips the image around the y-axis (horizontally).

Image transformation - Cropping
Cropping an image refers to extracting a segment of that image. It is the process of selecting a region of interest (ROI). For instance, in a face recognition application, we may want to extract the face from a given image or video. OpenCV represents images as NumPy arrays, so we can use slicing to crop the image. To crop an image, we need to know the coordinates of the ROI.
(Here we will use our a priori knowledge of the image and manually supply the NumPy array slices.)
    face = img[startY:endY, startX:endX]
OpenCV represents images as NumPy arrays with the height first (number of rows) and the width second (number of columns), so the y range comes before the x range.
○ startY: The starting y-coordinate
○ endY: The ending y-coordinate
○ startX: The starting x-coordinate
○ endX: The ending x-coordinate

Let's crop the face of Lena, from the top-left corner (90, 80) to the bottom-right corner (155, 170):
    # Filename: crop_image.py
    import cv2 as cv
    img = cv.imread('lenna.png')
    cv.imshow("Original Lena", img)
    cv.waitKey(0)
    # Coordinates of face: rows 80 to 170 (y), columns 90 to 155 (x)
    face = img[80:170, 90:155]
    cv.imshow("Face Cropped", face)
    cv.waitKey(0)
Coordinates Finder: https://pixspy.com/

Color space
A color model is an abstract mathematical model that describes how colors can be represented as sets of numbers. Color models can usually be described using a coordinate system, and each color in the system is represented by a single point in the coordinate space.

Color space - RGB
The RGB color model stores individual values for red, green, and blue. With a color space based on the RGB color model, the three primaries are added together to create colors ranging from completely black (all three at zero) to completely white (all three at maximum).

Color space - HSV
HSV (hue, saturation, value), also known as HSB (hue, saturation, brightness), is often used by artists because it is often more natural to think about a color in terms of hue and saturation than in terms of additive or subtractive color components. The system is closer to people's experience and perception of color than RGB. For example, in painting terms, hue, saturation, and value are expressed in terms of color, shading, and toning.
HSV is a cylindrical color model that remaps the RGB primary colors into dimensions that are easier for humans to understand.
Value: Controls the brightness of the color.
A color with 0% brightness is pure black, while a color with 100% brightness has no black mixed into it. Value works in conjunction with saturation and describes the brightness or intensity of the color from 0 to 100 percent, where 0 is completely black and 100 is the brightest and reveals the most color.

Color space - HSV
    import cv2 as cv
    bgr_img = cv.imread('logo.jpg')
    hsv_img = cv.cvtColor(bgr_img, cv.COLOR_BGR2HSV)
    print(hsv_img)
    cv.imshow('HSV image', hsv_img)
    cv.waitKey(0)

Application: Color Tracking
Program to track blue color using OpenCV and Python.
To convert an image loaded by OpenCV (BGR order) to an HSV image, we use the cv2.cvtColor() function. This function converts an image from one color space to another.
HSV Color Range in OpenCV
❑ Hue – [0, 179]
❑ Saturation – [0, 255]
❑ Value – [0, 255]

    # Filename: color_track.py
    import numpy as np
    import cv2 as cv
    # Capture video through webcam
    cap = cv.VideoCapture(0)
    while True:
        # Read frame by frame
        ret, frame = cap.read()
        # Stop if no frame could be read
        if not ret:
            break
        # Convert the BGR to HSV
        hsv = cv.cvtColor(frame, cv.COLOR_BGR2HSV)
        # Define the lower and upper range of blue color
        lower_blue = np.array([90, 50, 50])
        upper_blue = np.array([130, 255, 255])
        # Mask the colors
        mask = cv.inRange(hsv, lower_blue, upper_blue)
        cv.imshow('Mask', mask)

The inRange() function checks whether the elements of a given array lie between the corresponding elements of two other arrays, one holding the lower bounds and the other the upper bounds. inRange() returns:
○ 255 if the elements of the given array lie between the lower and upper bounds, or
○ 0 if the elements of the given array do not lie between the two bounds.
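The per-pixel check described above can be sketched in plain NumPy. This is an illustrative re-implementation under the assumption that a pixel passes only when all three of its channels lie within the bounds, inclusive; it is not OpenCV's own code. The name in_range_mask and the four pixel values are made up for the example, while the blue bounds are the ones used in color_track.py.

```python
import numpy as np


def in_range_mask(hsv, lower, upper):
    """Return 255 where every channel lies within [lower, upper], else 0."""
    inside = (hsv >= lower) & (hsv <= upper)   # per-channel check
    all_inside = inside.all(axis=-1)           # a pixel needs all channels inside
    return np.where(all_inside, 255, 0).astype(np.uint8)


# A hypothetical 2 x 2 HSV image (values chosen for illustration only)
hsv = np.array([[[110, 200, 200], [30, 200, 200]],
                [[120, 40, 200], [130, 255, 255]]], dtype=np.uint8)

lower_blue = np.array([90, 50, 50])
upper_blue = np.array([130, 255, 255])

mask = in_range_mask(hsv, lower_blue, upper_blue)
print(mask)
```

Only the first and last pixels fall inside the blue range: the second has a hue below 90 and the third has a saturation below 50, so both are set to 0 in the mask.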
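Application: Color Tracking
    # continuing inside the while loop of color_track.py
        result = cv.bitwise_and(frame, frame, mask=mask)
        cv.imshow('Frame', result)
        if cv.waitKey(1) == ord('q'):
            break

    # release the webcam and close the windows once the loop ends
    cap.release()
    cv.destroyAllWindows()
For binary images, a bitwise AND is true if and only if both pixels are greater than zero. The first frame is input 1, the second frame is input 2, and the mask applies the operation only to the marked area. Because the frame is ANDed with itself, pixels inside the mask keep their original values and pixels outside the mask appear black.
(Figures: output of inRange and output of bitwise_and.)

Application: Color Tracking
HSV ranges for other colors:
○ Green: H: 50-70, S: 100-255, V: 100-255
○ Red: H: 0-10, S: 70-255, V: 50-255

Application: Color Tracking [Challenge]
Activity: Write a Python program to detect any HSV color using a trackbar in OpenCV.

THANK YOU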