OpenCV and Python for Image Processing

Questions and Answers

Which of the following best describes the primary purpose of OpenCV?

  • Database management
  • Computer vision tasks (correct)
  • Web development
  • Operating system development

NumPy must be installed separately before installing OpenCV to ensure proper functioning of matrix operations.

False (B)

To read an image file into OpenCV, the function cv.______() should be used.

imread

What is the common cause of a '-215 assertion failed' error when working with media files in OpenCV?

The most common cause is that OpenCV cannot locate the specified media file, either due to an incorrect file path or the file not existing at the given location.

Match the cv.flip() flip codes with their corresponding actions:

  • 0 = Vertically flip the image
  • 1 = Horizontally flip the image
  • -1 = Vertically and horizontally flip the image

Which blurring technique is most effective at reducing 'salt and pepper' noise while preserving edges to a reasonable extent, compared to averaging or Gaussian blur?

Median Blur (C)

Explain the fundamental difference between 'contours' and 'edges' in the context of image processing, as discussed in the material.

Edges are points of significant intensity change in an image, often representing boundaries between regions. Contours, on the other hand, are the boundaries of objects themselves, formed by joining continuous edge points. Contours are object-based, while edges are pixel-based.

In cv.GaussianBlur(), the parameter that directly controls the standard deviation of the Gaussian kernel in the X direction is known as ______.

sigmaX

To convert a grayscale image directly to HSV color space in OpenCV, which of the following sequences of conversions is required?

Grayscale -> BGR -> HSV (C)

Bilateral blur utilizes a kernel size parameter, similar to averaging, Gaussian, and median blur, to define the area of pixels considered for blurring.

False (B)

Flashcards

What is OpenCV?

A computer vision library supporting Python, C++, and Java, used to extract insights from images and videos.

pip install opencv-contrib-python

Installs the main OpenCV package with contribution modules, along with NumPy for scientific computing.

What does cv.imread() do?

Reads an image from the specified path into a matrix of pixels.

Function of cv.imshow()

Displays an image in a new window, given a window name and the image matrix.

What does capture.read() do?

Reads the video frame by frame, returning the frame and a boolean indicating success.

What is Rescaling?

Modifies the height and width of a video or image, often to reduce computational strain.

Creating a blank image

Creates a blank image (NumPy array) with specified dimensions and color channels.

Drawing shapes on images

Draws a shape on an image using parameters such as start point, end point, color, and thickness; a thickness of -1 fills the shape.

Converting to grayscale

Converts an image to grayscale, showing pixel intensity distribution rather than color.

What is Translation?

Shifts an image along the x and y axes using a translation matrix and cv.warpAffine.

Study Notes

Course Overview

  • Introduction to Python and OpenCV for image and video processing.
  • Covers basics like reading/manipulating media files, image transformations, drawing shapes, and adding text.
  • Progresses to advanced topics: color spaces, bitwise operations, masking, histograms, edge detection, and thresholding.
  • Concludes with face detection, face recognition, and a deep computer vision model for classifying Simpsons characters.
  • All materials will be available on GitHub.

Introduction to OpenCV

  • OpenCV is a computer vision library available in Python, C++, and Java.
  • Computer vision extracts insights from images and videos, often using deep learning.
  • Python 3.7 or higher is required.
  • NumPy is a scientific computing package in Python used for matrix and array manipulations.

Package Installation

  • pip install opencv-contrib-python: installs the main OpenCV package with contribution modules.
  • NumPy is automatically installed alongside OpenCV, and facilitates scientific computing, especially for matrix and array related tasks.
  • pip install caer: installs a package of utility functions designed to speed up computer vision workflows.
  • It is used later in the course, specifically when building the deep computer vision model.

Reading Images in OpenCV

  • Use cv.imread() to read an image: img = cv.imread('photos/cat.jpg').
  • Takes the image path as input and returns the image as a matrix of pixels.
  • Use cv.imshow() to display an image in a new window.
  • Requires the window name and the image matrix as parameters: cv.imshow('Cat', img).
  • cv.waitKey(0) is a keyboard binding function that waits indefinitely for a key press.

Handling Large Images

  • OpenCV may struggle with displaying images larger than the monitor's resolution.
  • There isn't an inbuilt way to deal with this.
  • Resizing and rescaling techniques, which are covered in the next video, can help mitigate this issue.

Reading Videos in OpenCV

  • Read videos using cv.VideoCapture(): capture = cv.VideoCapture('videos/dog.mp4').
  • Accepts an integer (0, 1, 2, etc.) for webcams or the video file path.
  • 0 typically references the default webcam.
  • Reading videos requires a loop to process frames sequentially.

Video Processing Loop

  • capture.read() reads the video frame by frame, returning the frame and a boolean indicating success.
  • cv.imshow() displays each frame in a window: cv.imshow('Video', frame).
  • cv.waitKey(20) waits 20 milliseconds for a key press; the check cv.waitKey(20) & 0xFF == ord('d') breaks the loop when the 'd' key is pressed.
  • capture.release() releases the capture device.
  • cv.destroyAllWindows() closes all OpenCV windows.

Error Handling

  • A -215 assertion failed error usually indicates that OpenCV can't find the specified media file.
  • This error can occur when a video runs out of frames or if an incorrect file path is provided.

Resizing and Rescaling Frames

  • Resizing and rescaling reduce computational strain associated with large media files.
  • Rescaling involves modifying the height and width of a video or image.
  • Downscaling to dimensions smaller than the original is generally recommended.

Rescaling Function

  • rescale_frame(frame, scale=0.75) rescales a frame by a given factor.
  • Calculates new width and height based on the scale factor: width = int(frame.shape[1] * scale) and height = int(frame.shape[0] * scale).
  • Uses cv.resize() with dimensions = (width, height) and cv.INTER_AREA interpolation to resize the frame to the new dimensions.

Alternative Resizing Method

  • capture.set(propertyId, value) alters video capture properties.
  • Property IDs 3 and 4 correspond to width and height, respectively.
  • change_res(width, height) sets width and height using capture.set(3, width) and capture.set(4, height).
  • Only works for live video feeds.

Drawing on Images

  • Use np.zeros() to create a blank image: blank = np.zeros((500, 500, 3), dtype='uint8').
  • (500, 500) is the image size, 3 represents the color channels (BGR), and uint8 is the data type.
  • Painting an image involves setting pixel values: blank[:] = 0, 255, 0 paints the entire image green.
  • Specifying a range of pixels allows coloring specific regions: blank[200:300, 300:400] = 0, 0, 255 creates a red square.
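
A short sketch of the painting steps above, using only NumPy. The first slice index selects rows (y) and the second selects columns (x).

```python
import numpy as np

# 500x500 pixels, 3 color channels (BGR), 8-bit unsigned values.
blank = np.zeros((500, 500, 3), dtype='uint8')

blank[:] = 0, 255, 0                  # paint every pixel green (B=0, G=255, R=0)
blank[200:300, 300:400] = 0, 0, 255   # rows 200-299, columns 300-399 become red

print(blank[0, 0])      # a green pixel outside the square
print(blank[250, 350])  # a red pixel inside the square
```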

Drawing a Rectangle

  • Use cv.rectangle() to draw a rectangle: cv.rectangle(blank, (0, 0), (250, 250), (0, 255, 0), thickness=2).
  • Parameters include the image, starting point, ending point, color, and thickness.
  • To fill the rectangle, set thickness=cv.FILLED or thickness=-1.
  • The starting point is the top-left corner, the ending point is the bottom-right corner.
  • Use (img.shape[1] // 2, img.shape[0] // 2) as the ending point to draw a rectangle that ends at the center of the image.

Drawing Shapes and Text on Images with OpenCV

  • cv.rectangle draws rectangles, requiring the image, two points (top-left and bottom-right corners), color in BGR format, and thickness.
  • Negative thickness fills the rectangle.
  • cv.circle draws circles, needing the image, center coordinates, radius, color, and thickness.
  • A negative thickness fills the circle.
  • cv.line draws lines, requiring the image, start point, end point, color, and thickness.
  • cv.putText writes text on images, taking the image, text, origin (bottom-left corner of the text), font face (e.g., cv.FONT_HERSHEY_TRIPLEX), font scale, color, and thickness.
  • Origin specifies the starting point for the text.
  • OpenCV has built-in fonts like Hershey Triplex.
  • Adjust the origin or margins to handle text that goes off-screen.

Basic Functions in OpenCV

  • Converting an image to grayscale is done using cv.cvtColor with the cv.COLOR_BGR2GRAY color code. It shows intensity distribution of pixels rather than color.
  • Blurring an image reduces noise, and Gaussian Blur is a common technique in computer vision.
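  • cv.GaussianBlur requires the image, a kernel size (a tuple of odd numbers such as (3, 3) or (7, 7)), and a third argument, for which the course passes cv.BORDER_DEFAULT; OpenCV's signature treats this third positional argument as sigmaX.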
  • A larger kernel size increases blurring.
  • Edge cascade identifies edges; the Canny edge detector is a multi-step process with blurring and gradient computations.
  • cv.Canny takes the image and two threshold values as arguments.
  • Blurring the image before finding edges can reduce the number of edges found.
  • Dilation expands edges using cv.dilate, which takes the source image (such as the Canny edges), a kernel, and the number of iterations.
  • Erosion shrinks dilated edges using cv.erode, which takes the dilated image, a kernel, and the number of iterations; it can approximately restore the original edge cascade if the parameters match those used for dilation.
  • Resizing images uses cv.resize with the image and destination size.
  • cv.INTER_AREA interpolation is well suited to shrinking; if no interpolation is given, cv.resize uses cv.INTER_LINEAR.
  • Use cv.INTER_LINEAR or cv.INTER_CUBIC when enlarging images; cv.INTER_CUBIC is the slowest but yields the best quality.
  • Cropping images uses array slicing on pixel values in the image array.

Image Transformations in OpenCV

  • Translation shifts an image along the x and y axes: up, down, left, right, or any combination of these.
  • It requires constructing a translation matrix using NumPy.
  • Negative x values translate the image to the left, and negative y values translate up.
  • Positive x values translate the image to the right, and positive y values translate down.
  • Use cv.warpAffine to apply the transformation.
  • Rotation rotates an image around a specified point.
  • cv.getRotationMatrix2D creates the rotation matrix, taking the rotation point, angle, and scale.
  • A rotation helper can default the rotation point to the image center when none is specified; cv.getRotationMatrix2D itself requires a center.
  • cv.warpAffine applies the rotation.
  • Positive angles rotate the image counterclockwise, while negative angles rotate it clockwise.
  • Resizing adjusts the dimensions of an image to enlarge or reduce it.
  • cv.resize requires the image and the desired destination size.
  • The interpolation method (cv.INTER_AREA, cv.INTER_LINEAR, cv.INTER_CUBIC) affects the final image quality; cv.INTER_CUBIC produces the best quality when enlarging.
  • Cropping selects a portion of the image using array slicing over pixel ranges.

Image Enlarging

  • Use cv.INTER_LINEAR or cv.INTER_CUBIC to enlarge images.
  • cv.INTER_CUBIC is slower, but the resulting image is of higher quality.

Flipping Images

  • Requires the cv.flip function.
  • Requires an image and a flip code as input.
  • Flip code options:
    • 0: Flips the image vertically (over the x-axis).
    • 1: Flips the image horizontally (over the y-axis).
    • -1: Flips the image both vertically and horizontally.

Cropping Images

  • Achieved through array slicing.
  • Specify the pixel ranges for height and width to define the cropped region.

Image Transformations Covered

  • Translation
  • Rotation
  • Resizing
  • Flipping
  • Cropping

Contours

  • Contours are the boundaries of objects.
  • A contour is a line or curve that joins the continuous points along the boundary of an object.
  • Contours are useful tools in shape analysis, object detection, and recognition.
  • From a mathematical point of view, contours and edges are different things.

Finding Contours

  • Convert the image to grayscale.
  • Grab the edges of the image using the Canny edge detector.
  • Pass the edge image to the cv.findContours method.
  • cv.findContours looks at the edge image and returns two values: contours and hierarchies.
  • contours is essentially a Python list of the coordinates of all the contours that were found in the image.
  • hierarchies refers to the hierarchical representation of the contours.
  • cv.RETR_EXTERNAL retrieves only the external contours, that is, the ones on the outside.
  • cv.RETR_TREE returns all the hierarchical contours.
  • cv.CHAIN_APPROX_NONE applies no approximation and returns all the contour points.
  • cv.CHAIN_APPROX_SIMPLE compresses the returned points into the ones that make most sense; for example, a straight line is reduced to its two end points.
  • Blurring the image before finding edges reduces the number of contours detected.

Threshold

  • Thresholding is another way to prepare an image for finding contours, instead of Canny.
  • Thresholding looks at an image and binarizes it: every pixel becomes either 0 (black) or 255 (white).
  • With a threshold of 125, a pixel whose intensity is below 125 is set to 0 (black).
  • A pixel whose intensity is above 125 is set to 255 (white).

Visualizing Contours

  • Contours can be visualized by drawing over the image.
  • Use the cv.drawContours method for this.
  • cv.drawContours takes the image to draw over, the contours (as a list), a contour index (-1 to draw all), color, and thickness as inputs.

Color Spaces

  • A color space is a system for representing an array of pixel colors.
  • RGB is a color space, grayscale is a color space, and there are others such as HSV and LAB.

Converting to Grayscale

  • Call cv.cvtColor: gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY).
  • Pass in the image and the color code cv.COLOR_BGR2GRAY, since the conversion is from the BGR format to the grayscale format.

HSV

  • HSV (Hue, Saturation, Value) is based on how humans think about and perceive color.
  • Convert from BGR to HSV with hsv = cv.cvtColor(img, cv.COLOR_BGR2HSV): pass in the image and the color code cv.COLOR_BGR2HSV.

LAB

  • Convert from BGR to LAB with lab = cv.cvtColor(img, cv.COLOR_BGR2LAB): pass in the image and the color code cv.COLOR_BGR2LAB.

BGR vs RGB

  • OpenCV reads images in BGR format (Blue, Green, Red).
  • Outside of OpenCV, the RGB format is more commonly used.
  • Displaying a BGR image in a library expecting RGB swaps the red and blue channels, so the colors appear inverted.

Converting Between BGR and RGB

  • Use cv.cvtColor with the appropriate color code (cv.COLOR_BGR2RGB or cv.COLOR_RGB2BGR).
  • Be mindful of color inversions when working with different libraries.

Color Space Conversion Limitations

  • Grayscale images cannot be directly converted to HSV.
  • Convert grayscale to BGR first, then BGR to HSV.

HSV to BGR Conversion

  • Converting from HSV to BGR uses cv.cvtColor in OpenCV.
  • The color code used for the conversion is cv.COLOR_HSV2BGR.

Color Space Conversions

  • A BGR image can be converted to grayscale, HSV, LAB, and RGB.
  • There's no direct method to convert grayscale to LAB.
  • To convert grayscale to LAB, convert grayscale to BGR first, then BGR to LAB.

Splitting and Merging Color Channels

  • A color image consists of multiple channels (red, green, and blue in BGR or RGB images).
  • OpenCV allows splitting an image into its respective color channels (e.g., BGR into blue, green, and red components).
  • cv.split splits an image into blue, green, and red channels.
  • The shape of each split channel has no third element, because each one holds a single channel.
  • Components are displayed as grayscale images showing pixel intensity distribution.
  • Lighter regions indicate a higher intensity of that channel's color.
  • Darker regions indicate little or none of that color.
  • In a color image's shape tuple, the additional third element is the number of color channels.
  • Grayscale images have a single channel, so their shape omits this element.
  • cv.merge merges color channels together by passing in a list of the channels.
  • To display each channel in its actual color, use NumPy to create a blank image with the same height and width but no color channel dimension, then merge it with the channel, for example cv.merge([b, blank, blank]).
  • Merging the blue, green, and red channels gives back the original image.

Smoothing and Blurring Techniques

  • Smoothing and blurring are used to reduce noise in images caused by camera sensors or lighting issues.
  • A kernel or window is drawn over a portion of the image, and an operation is applied to the pixels in this window.
  • Kernel size is the number of rows and columns in the kernel.
  • The blur is applied to the middle pixel, based on the surrounding pixels.

Averaging Blur

  • Averaging blur computes the pixel intensity of the center pixel as the average of the surrounding pixel intensities.
  • cv.blur is used to apply averaging blur.
  • It takes the source image and the kernel size.
  • Higher kernel sizes result in more blur.

Gaussian Blur

  • Gaussian blur is similar to averaging, but each surrounding pixel is given a particular weight.
  • The weighted average of the surrounding pixels gives the value of the center pixel.
  • For the same kernel size, Gaussian blur results in less blurring than averaging.
  • It is considered more natural than averaging.
  • cv.GaussianBlur is used to apply Gaussian blur.
  • It needs the source image, the kernel size, and sigmaX (the standard deviation in the X direction).

Median Blur

  • Median blurring finds the median of the surrounding pixels instead of the average.
  • More effective in reducing noise, especially salt and pepper noise.
  • Commonly used in advanced computer vision projects needing substantial noise reduction.
  • It is important not to use high kernel sizes as you end up with a "washed up smudged version" of the image.
  • cv.medianBlur is used to apply median blur.
  • The kernel size is passed as a single integer rather than a tuple, since the function assumes the same height and width.

Bilateral Blurring

  • Bilateral blurring applies blurring while retaining edges in the image.
  • It does this by looking at whether the blurring is reducing edges or not.
  • cv.bilateralFilter is used to apply bilateral blurring.
  • Parameters include the image, diameter of the pixel neighborhood, sigmaColor (color sigma), and sigmaSpace (space sigma).
  • sigmaColor: A larger value means more colors in the neighborhood are considered.
  • sigmaSpace: Larger values mean pixels further from the center influence the calculation.
  • Higher sigmaSpace values can cause the image to look blurred and smudged.
  • The diameter is the diameter of the pixel neighborhood, not a kernel size.

Bitwise Operators

  • Four basic bitwise operators: AND, OR, XOR, and NOT.
  • Used in image processing, especially with masks.
  • Operate in a binary manner: a pixel is turned off if it has a value of zero, and is turned on if it has a value of one.
