COMPUTER VISION_THEORY.pdf
UNIT 5 COMPUTER VISION

OBJECTIVE
1. Define the concept of Computer Vision and understand its applications in various fields.
2. Understand the basic concepts of image representation, feature extraction, object detection, and segmentation.

The Computer Vision domain of Artificial Intelligence enables machines to see through images or visual data, and to process and analyse them using algorithms and methods in order to understand real-world phenomena.

Applications of Computer Vision

Facial Recognition
Computer vision plays an essential role in the smart homes and smart cities of today. One of its most important applications is facial recognition in security systems, which can be used to identify visitors or to maintain a visitor log.

Face Filters
Many features in today’s apps, including Instagram and Snapchat, rely on computer vision. One of them is the face filter. Through the camera, the algorithm recognises a person’s facial features and movements and applies the chosen filter.

Google’s Search by Image
Most searches on Google’s search engine are for textual information, but it also offers the option of searching with an image. This makes use of computer vision: it examines numerous attributes of the input image and compares them with those of the images in its database to produce the search result.

Computer Vision in Retail
Retail is one of the fastest-growing industries, and it also uses computer vision to improve the user experience. Using computer vision techniques, retailers can track customer movements through stores, analyse navigational routes, and find walking patterns.

Self-Driving Cars
Computer vision is the fundamental technology behind developing autonomous vehicles.
Most leading car manufacturers in the world are investing in artificial intelligence to develop on-road versions of hands-free technology. This involves identifying objects, working out navigational routes, and monitoring the environment, all at the same time.

Medical Imaging
Computer-supported medical imaging software has been a reliable resource for doctors over the past few decades. It doesn’t just produce and analyse images; it also acts as a doctor’s assistant to aid in interpretation. The software is used to interpret 2D scan images and transform them into interactive 3D models that give medical professionals a thorough insight into a patient’s health.

Google Translate App
To read signs written in a foreign language, all you have to do is point your phone’s camera at the text, and the Google Translate app will almost immediately translate it into the language of your choice. This application makes use of computer vision: optical character recognition reads the text in the image, and augmented reality overlays the translation.

Computer Vision Tasks
The various applications of Computer Vision are built on a number of tasks that extract information from the input image. This information can either be used directly for prediction or form the basis for further analysis. The tasks used in a computer vision application are:

Image Classification: Image classification is the task of assigning an input image one label from a fixed set of categories. This is one of the core problems in CV that, despite its simplicity, has a large variety of practical applications.

Classification + Localisation: This task involves identifying what object is present in the image and, at the same time, where in the image that object is located. It is used only for single objects.
Object Detection: Object detection is the process of finding instances of real-world objects such as faces, bicycles, and buildings in images or videos. Object detection algorithms typically use extracted features and learning algorithms to recognise instances of an object category. It is commonly used in applications such as image retrieval and automated vehicle parking systems.

Instance Segmentation: Instance segmentation is the process of detecting instances of objects, assigning each a category, and then labelling each pixel accordingly. A segmentation algorithm takes an image as input and outputs a collection of regions (or segments).

Basics of Images
We all see a lot of images around us and use them daily, either through our mobile phones or computer systems. But do we ask ourselves some basic questions about them while using them so regularly?

Basics of Pixels
The word “pixel” means picture element. Every photograph, in digital form, is made up of pixels. They are the smallest units of information that make up a picture. Usually round or square, they are typically arranged in a 2-dimensional grid. In the image below, one portion has been magnified many times over so that you can see its individual composition in pixels. As you can see, the pixels approximate the actual image. The more pixels you have, the more closely the image resembles the original.

Resolution
The number of pixels in an image is sometimes called its resolution. When the term is used this way, one convention is to express resolution as width by height, for example, a monitor resolution of 1280×1024. This means there are 1280 pixels from side to side and 1024 pixels from top to bottom.

Pixel value
Each of the pixels that make up an image stored on a computer has a pixel value that specifies its brightness and/or intended colour.
The most common pixel format is the byte image, which stores this number as an 8-bit integer with a possible range of values from 0 to 255. Zero is typically used to represent no colour or black, and 255 is used to represent full colour or white.

Grayscale Images
Grayscale images have a range of shades of gray without apparent colour. The darkest possible shade is black, which is the total absence of colour, or a pixel value of 0. The lightest possible shade is white, which is the total presence of colour, or a pixel value of 255. Intermediate shades of gray are represented by equal brightness levels of the three primary colours.

Here is an example of a grayscale image. As you can check, the pixel values are within the range 0-255. Computers store the images we see in the form of these numbers.

RGB Images
Most images we encounter are coloured images. These images are made up of three primary colours: Red, Green, and Blue. Red, green, and blue can be combined in various intensities to create all the colours we see on a screen.

Let us experience! Go to this online link: https://www.w3schools.com/colors/colors_rgb.asp. Using this online tool, try to answer all the questions below.

1) What is the output colour when you put R=G=B=255?
____________________________________________________________________________________________________
2) What is the output colour when you put R=G=B=0?
____________________________________________________________________________________________________
3) How does the colour vary when you set any one of the three to 0 and then keep varying the other two?
____________________________________________________________________________________________________
____________________________________________________________________________________________________
____________________________________________________________________________________________________
4) How does the output colour change when all three colours are varied in the same proportion?
____________________________________________________________________________________________________
____________________________________________________________________________________________________
____________________________________________________________________________________________________
5) What is the RGB value of your favourite colour from the colour palette?
____________________________________________________________________________________________________

How do computers store RGB images?
Every RGB image is stored in the form of three different channels called the R channel, G channel and B channel. Each channel (or plane) has its own grid of pixels, with each pixel value ranging from 0 to 255. When all three planes are combined, they form a colour image. This means that in an RGB image, each pixel has a set of three different values which together give that pixel its colour.

For example:

As you can see, each colour image is stored in the form of three different channels, each with different intensities. The three channels combine to form the colour we see.

If we split the image above into its three channels, namely Red (R), Green (G) and Blue (B), the individual layers have the following colour intensities at each pixel. When stored in memory, these individual layers look like the image on the extreme right.
Each stored layer looks like a grayscale image because each of its pixels has an intensity value from 0 to 255 and, as studied earlier, 0 is considered black (no presence of colour) and 255 white (full presence of colour). These three individual RGB values, when combined, form the colour of each pixel. Therefore, each pixel in an RGB image has three values that together form its complete colour.
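
The idea of three channels per pixel can be tried out with a small sketch in Python. This is only an illustration under stated assumptions: the 2x2 image and the helper names split_channels, merge_channels and is_gray are made up for this example and are not part of the original material or of any library. The sketch stores each pixel as an (R, G, B) tuple of values from 0 to 255, splits the image into three single-channel grids, and recombines them.

```python
# A tiny 2x2 RGB image, made up for illustration: each pixel is an (R, G, B)
# tuple whose values lie in the 8-bit range 0..255.
image = [
    [(255, 0, 0), (0, 255, 0)],      # red, green
    [(0, 0, 255), (128, 128, 128)],  # blue, mid-gray (R = G = B)
]


def split_channels(img):
    """Return three 2-D grids (R, G, B), one intensity value per pixel."""
    red = [[pixel[0] for pixel in row] for row in img]
    green = [[pixel[1] for pixel in row] for row in img]
    blue = [[pixel[2] for pixel in row] for row in img]
    return red, green, blue


def merge_channels(red, green, blue):
    """Recombine three channel grids into one grid of (R, G, B) pixels."""
    return [
        [(r, g, b) for r, g, b in zip(r_row, g_row, b_row)]
        for r_row, g_row, b_row in zip(red, green, blue)
    ]


def is_gray(pixel):
    """A pixel is a shade of gray when its three channel values are equal."""
    r, g, b = pixel
    return r == g == b


red, green, blue = split_channels(image)
print(red)    # [[255, 0], [0, 128]]
print(green)  # [[0, 255], [0, 128]]
print(blue)   # [[0, 0], [255, 128]]

# Combining the three channels gives back the original colour image.
print(merge_channels(red, green, blue) == image)  # True

# Equal R, G and B values give a shade of gray; pure red does not.
print(is_gray(image[1][1]), is_gray(image[0][0]))  # True False
```

Each printed grid holds only single numbers from 0 to 255, which is why an individual channel looks like a grayscale image when displayed on its own. Real programs usually rely on an image library rather than nested lists, but the layout of the data follows the same idea.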