Object Detection PDF
Document Details
Uploaded by ModernNeptunium
Nanyang Polytechnic
Summary
This document is a presentation on object detection. It explains the concepts of object detection and how it differs from image classification. It also covers bounding boxes, the variable P, and the calculation of Intersection over Union (IoU).
Full Transcript
Object Detection
Official (Open)

Object Detection
Object detection is a computer vision technique for locating instances of objects in images or videos. Humans can easily detect and identify the objects present in an image. Let's simplify this statement with the help of the image below.

Instead of classifying which type of dog is present in these images, we have to locate the dog in the image. That is, we have to find out where the dog is in the image: at the center, at the bottom left, and so on. The next question is how to do that. We can draw a box around the dog in the image and specify the x and y coordinates of this box.

For now, consider that the location of the object in the image can be represented by the coordinates of this box. The box around the object is formally known as a bounding box. This becomes an image localization problem, where we are given a set of images and have to identify where the object is in each image. Note that here we have a single class. What if we have multiple classes? Here is an example.

Image Classification vs Object Detection
An image classification problem has only one task: classifying the objects in the image. In an object detection problem, we have to classify the objects in the image and also locate where they are in the image.

For example, in the image below, in the first case we predict only the target class; such tasks are known as image classification problems. In the second case, along with predicting the target class, we also have to find the bounding box that denotes the location of the object.

Object Detection
This is all about the object detection problem. Broadly, an object detection problem has three tasks:
1. Identify whether there is an object present in the image.
2. Find where this object is located.
3. Determine what this object is.
See the image below.

Single-Class Object Detection

Multi-Objects in an Image?
Another problem could be one where you are given multiple images, and each of these images contains multiple objects. These objects can all be of the same class, or they can be of different classes.

Review
In the last section, we discussed:
◦ the object detection problem
◦ how it is different from a classification problem
◦ the three tasks of an object detection problem

Object Detection Task
What data do we need for object detection?

What data do we need?
In this case, the target variable has five values. The value p denotes the probability of an object being in the image above, and the four values Xmin, Ymin, Xmax, and Ymax denote the coordinates of the bounding box. How are these coordinate values calculated?

How are bounding box coordinates calculated?
Consider the x-axis and y-axis drawn above the image. In that case, (Xmin, Ymin) is the top left corner of the bounding box, and (Xmax, Ymax) is the bottom right corner of the bounding box.

What is the variable P?
Note that the target variable answers only two questions:
1. Is there an object present in the image?
◦ Answer: If no object is present, p is zero; when there is an object present in the image, p is one.
2. If an object is present in the image, where is it located?
◦ Answer: You can find the object location using the coordinates of the bounding box.
In this case, all the images have a single class: a car.
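
As a rough illustration of the five-value target variable described above, here is a minimal sketch in plain Python. The function name make_target, the pixel values, and the use of zeros as placeholder coordinates when no object is present are assumptions made for this example; they do not come from the slides.

```python
def make_target(box=None):
    """Build [p, xmin, ymin, xmax, ymax] for a single-class image.

    box is a hypothetical (xmin, ymin, xmax, ymax) tuple in pixels,
    or None when the image contains no object.
    """
    if box is None:
        # No object: p is 0. The zero coordinates are placeholders
        # (an assumption), since there is no box to describe.
        return [0, 0, 0, 0, 0]
    xmin, ymin, xmax, ymax = box
    if xmin >= xmax or ymin >= ymax:
        # (xmin, ymin) must be the top left corner, (xmax, ymax) the bottom right.
        raise ValueError("expected xmin < xmax and ymin < ymax")
    # Object present: p is 1, followed by the bounding box coordinates.
    return [1, xmin, ymin, xmax, ymax]


print(make_target((40, 60, 200, 180)))  # image with a car
print(make_target())                    # image without any object
```

The check on the coordinates reflects the convention in the slides that the first pair is the top left corner and the second pair is the bottom right corner.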
What happens when there are more classes? In that case, this is what the target variable would look like.

More than 1 class?
Assume you need two classes:
◦ emergency vehicle
◦ non-emergency vehicle
The target variable then has two additional values, c1 and c2. In this example, the probability p of an object being present in the image is one. Xmin, Ymin, Xmax, and Ymax are the coordinates of the bounding box. Then c1 is equal to 1, since this is an emergency vehicle, and c2 is 0, since it is not a non-emergency vehicle.

Possible outcome

Review
Before going into more depth, we need to know a few concepts:
1. How to do bounding box evaluation?
2. How to calculate IoU?
3. Evaluation metric: mean Average Precision
Let's start with the first one, bounding box evaluation.
Note: IoU stands for Intersection over Union.

More on Bounding Box

What is a bounding box?
A bounding box is a rectangle that surrounds an object and specifies its position, class (e.g. car, person), and confidence (how likely it is to be at that location). Bounding boxes are mainly used in the task of object detection, where the aim is to identify the position and type of multiple objects in the image. For example, in the image on the right, the green rectangle is a bounding box that describes where in the image the car lies.

Conventions used in specifying a bounding box
There are 2 main conventions followed when representing bounding boxes:
1. Specifying the box with respect to the coordinates of its top left and bottom right points.
2. Specifying the box with respect to its center, and its width and height.
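
Returning to the two-class target variable described earlier in this section (emergency and non-emergency vehicle), the sketch below shows one way the seven values could be assembled in plain Python. The names CLASSES and make_multiclass_target and the pixel values are hypothetical and chosen only for illustration; the box uses the top left and bottom right convention.

```python
# Class order fixes which position is c1 and which is c2 (as in the slides).
CLASSES = ("emergency", "non-emergency")


def make_multiclass_target(box, label):
    """Return [p, xmin, ymin, xmax, ymax, c1, c2] for one labelled object."""
    if label not in CLASSES:
        raise ValueError(f"unknown class: {label!r}")
    # One value per class: 1 for the object's class, 0 for the other.
    class_values = [1 if name == label else 0 for name in CLASSES]
    # p is 1 because an object is present in the image.
    return [1, *box, *class_values]


target = make_multiclass_target((40, 60, 200, 180), "emergency")
print(target)  # c1 is 1 and c2 is 0 for an emergency vehicle
```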
Bounding box specified with respect to its top left and bottom right points
Bounding box specified with respect to its center coordinates

Parameters used to define a bounding box
Depending on the convention followed, these are the main parameters that specify a bounding box:
1. Class: what the object inside the box is, e.g. car, truck, person.
2. (x1, y1): the x and y coordinates of the top left corner of the rectangle.
3. (x2, y2): the x and y coordinates of the bottom right corner of the rectangle.
4. (xc, yc): the x and y coordinates of the center of the bounding box.
5. Width: the width of the bounding box.
6. Height: the height of the bounding box.
7. Confidence: how likely it is that the object is actually present in the box. E.g. a confidence of 0.9 indicates a 90% chance that the object actually exists in that box.

Converting between the conventions
We can convert between the different forms of representing the bounding box, depending on our use case:
1. xc = (x1 + x2) / 2
2. yc = (y1 + y2) / 2
3. width = x2 - x1
4. height = y2 - y1

Code (how to define a bounding box, an example)

    import torch

    def box_corner_to_center(boxes):
        """Convert boxes from (x1, y1, x2, y2) to (cx, cy, w, h)."""
        # boxes is a tensor of shape (n, 4), one row per box in corner format
        x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
        cx = (x1 + x2) / 2  # center x
        cy = (y1 + y2) / 2  # center y
        w = x2 - x1         # width
        h = y2 - y1         # height
        # Stack the four columns back into a tensor of shape (n, 4)
        return torch.stack((cx, cy, w, h), dim=-1)

Bounding Box Evaluation

Intersection over Union (IoU)
Intersection over Union (IoU) is used to determine the target variable.

Scenario 1
Let's consider another example. Suppose we have created multiple bounding boxes, or patches, of different sizes.

Which bounding box is more accurate?
(With formula)

Mathematical formula for IoU
IoU = Area of intersection / Area of union

Scenario 1
Now, what would be the range of IoU? Let's consider some extreme scenarios.

Scenario 2
Another possible scenario is when the predicted bounding box and the actual bounding box completely overlap. In that case, the area of intersection equals the area of union.

Practically

Calculating IoU

Area of intersection

Area of union

Calculation

Note to the tutor
The calculation is not a simple function, and the code is too long to put on a presentation slide. Please have the students get the lab7lab.zip file. There are document files with the example code. Please get the students to go through the code, change the parameters, and experiment with it. Please remind them that they will be tested on the code.
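
The example code for this section is in the lab files. As a rough illustration only of the area-of-intersection and area-of-union steps named in the slide titles above, here is a minimal sketch in plain Python. It is not the lab code; the function names box_area and iou are assumptions, and boxes are assumed to be (xmin, ymin, xmax, ymax) tuples in the top left and bottom right convention.

```python
def box_area(box):
    """Area of an (xmin, ymin, xmax, ymax) box; 0 if the box is empty."""
    xmin, ymin, xmax, ymax = box
    return max(0, xmax - xmin) * max(0, ymax - ymin)


def iou(box_a, box_b):
    """Intersection over Union of two (xmin, ymin, xmax, ymax) boxes."""
    # The intersection rectangle starts at the larger of the two minimums
    # and ends at the smaller of the two maximums, on each axis.
    inter_box = (
        max(box_a[0], box_b[0]),
        max(box_a[1], box_b[1]),
        min(box_a[2], box_b[2]),
        min(box_a[3], box_b[3]),
    )
    inter = box_area(inter_box)  # 0 when the boxes do not overlap
    # Union = both areas minus the part counted twice.
    union = box_area(box_a) + box_area(box_b) - inter
    return inter / union if union > 0 else 0.0


actual = (0, 0, 10, 10)
print(iou(actual, (0, 0, 10, 10)))    # complete overlap
print(iou(actual, (20, 20, 30, 30)))  # no overlap
print(iou(actual, (5, 5, 15, 15)))    # partial overlap
```

The two extreme scenarios give the ends of the range: boxes that do not overlap have an IoU of 0, and boxes that overlap completely have an IoU of 1.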