Computer Vision - Lecture 1 - Introduction Overview
Dr. Heba Hamdy
Summary
Computer vision is a field of artificial intelligence, enabling computers to derive meaningful information from visual inputs like images and videos. Lecture notes discuss computer vision versus computer graphics, image processing techniques (like color depth and image categories) and related tasks. The material is suitable for an undergraduate course in computer science or a related field.
Full Transcript
COMPUTER VISION
DR. HEBA HAMDY

WHAT IS COMPUTER VISION?
Computer vision is a field of artificial intelligence (AI) that enables computers and systems to derive meaningful information from digital images, videos and other visual inputs, and to take actions or make recommendations based on that information. If AI enables computers to think, computer vision enables them to see, observe and understand.

TEACHING & ASSESSMENT PLAN

Week 1: Introduction To Computer Vision
- What is computer vision?
- Computer Vision vs. Computer Graphics
- What about Image Processing?
- How does computer vision work?
- Computer Vision Pipeline
- Computer vision applications
- Computer vision tasks

Week 2: Review on image processing
- 2D and 3D Cartesian Coordinates
  - Right- and Left-handed 3D Coordinate Systems
  - Image Coordinate System
- Digital Images
  - Color Depth
  - Image Categories
  - Binary Images
  - Grayscale Images
  - Color Images and RGB Color Space
- Reading Assignment: Basics of image processing

Filters
- Linear filters:
  - Uniform (mean) filter
  - Triangular filter
  - Gaussian filter
- Non-linear filters:
  - Median filter

Feature Detectors
- Edge Detectors
  - Roberts Detector
  - Sobel Detector
  - Compass Detector
  - Canny Detector
- Line Detector

Logical & Morphological Operations
Image Logic Operations
- AND/OR
- AND, OR, NOT (COMPLEMENT)
- XOR (exclusive OR), NAND (NOT-AND)
Image morphology
- Erosion/Dilation
- Opening
- Closing
- Applications on Morphological operations

Image Segmentation
- Pixel based segmentation (group pixels together into regions of similarity)
- Thresholding
- Region splitting
- Region growing (merging)
- Split and merge
- Color Segmentation using K-means clustering
- Reading Assignment: Segmentation

Corner detectors
- Harris/Plessey Corner Detector
- SIFT Detectors

Midterm Exam (Theoretical & Practical)

Week 10: Feature Matching
Distance Metrics
- Euclidean
- Manhattan (City Block)
- Chessboard
Similarity measures, or correlation techniques (functions)
- Take-Home Assignment: Applying feature similarity measures

Weeks 11, 12: Machine learning
- Supervised Learning
  - Naïve Bayes classification
  - k-Nearest Neighbor
- Unsupervised Learning
- Reading Assignment: Apply Classification

Week 13: General Revision for Final Exam
Week 14: Final Exam (Theoretical)

GRADING POLICY
Course Activity                Points
Quizzes (Practical)            10%
Assignments                    10%
Project                        10%
Midterm Exam (Theoretical)     30%
Final Exam (Theoretical)       40%
Total                          100%

COMPUTER VISION VS. COMPUTER GRAPHICS
Computer vision and computer graphics: are they the same field?
Computer vision is the field of science that is concerned with recognizing environments through a set of images (to help machines see).
Input: images.
Output: description (e.g., locations of objects, dimensions, etc.)
The ultimate goal of computer vision is to simulate the vision system in humans.

[Figure: computer vision, an example]

Computer graphics is the field of science that is concerned with generating visual images synthetically from descriptive information.
Input: description (e.g., locations of objects, dimensions, etc.)
Output: images.
Computer graphics is the inverse operation of computer vision.

[Figure: computer graphics, an example]

WHAT ABOUT IMAGE PROCESSING?
Image processing is the field of science that is concerned with manipulating images.
Input: images.
Output: images.
Most of the time, image processing techniques are used as preliminary steps in vision systems (e.g., to enhance the quality of images). Sometimes it is hard to draw the line between computer vision and image processing, so some topics may be categorized in both fields.
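
The input/output distinction between the three fields can be sketched with a toy binary image. This is only an illustration under simple assumptions: the image is a list of rows of 0/1 pixels, the "description" is a rectangle's bounding box, and the function names (`render_rectangle`, `invert`, `bounding_box`) are hypothetical names chosen for this sketch, not part of the lecture material or of any library.

```python
# Toy illustration of the three input/output relationships.
# Images are lists of rows; each pixel is 0 (background) or 1 (foreground).

def render_rectangle(width, height, top, left, bottom, right):
    """Computer graphics: description -> image."""
    return [
        [1 if top <= y <= bottom and left <= x <= right else 0
         for x in range(width)]
        for y in range(height)
    ]

def invert(image):
    """Image processing: image -> image (here, a simple negative)."""
    return [[1 - pixel for pixel in row] for row in image]

def bounding_box(image):
    """Computer vision: image -> description.
    Returns (top, left, bottom, right) of the foreground, or None if empty."""
    coords = [(y, x)
              for y, row in enumerate(image)
              for x, pixel in enumerate(row) if pixel == 1]
    if not coords:
        return None
    ys = [y for y, _ in coords]
    xs = [x for _, x in coords]
    return (min(ys), min(xs), max(ys), max(xs))

# Description -> image -> description round trip.
image = render_rectangle(8, 6, top=1, left=2, bottom=3, right=5)
description = bounding_box(image)   # recovers (1, 2, 3, 5)
negative = invert(image)            # still an image, same size
```

In this sketch the vision step recovers exactly the description that the graphics step started from, which is the sense in which the two are inverse operations; real images rarely allow such a clean round trip.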
[Figure: image processing, an example]

HOW DOES COMPUTER VISION WORK?
Computer vision needs lots of data. It runs analyses of the data over and over until it discerns distinctions and ultimately recognizes images. For example, to train a computer to recognize automobile tires, it needs to be fed vast quantities of tire images and tire-related items to learn the differences and recognize a tire, especially one with no defects.
Machine learning uses algorithmic models that enable a computer to teach itself about the context of visual data. If enough data is fed through the model, the computer will "look" at the data and teach itself to tell one image from another. Algorithms enable the machine to learn by itself, rather than someone programming it to recognize an image.

COMPUTER VISION PIPELINE
There are typical steps that are found in many computer vision systems.

Image acquisition: a digital image is produced by one or several image sensors (e.g., light-sensitive cameras, radar, ultrasonic cameras, etc.)

Pre-processing: it is usually necessary to process the image to ensure that it satisfies certain assumptions before a computer vision method is applied to it. For example, noise may need to be reduced or contrast may need to be enhanced.

Detection/segmentation: we may need to determine which image points or regions of the image are relevant for further processing. For example, we may select a specific set of interest points or segment an image region that contains an object of interest.
The position of the image segmentation phase is not fixed in the CV pipeline. It might be part of the pre-processing phase or follow it (as in our pipeline), or it might be part of the feature extraction and selection phase or follow them.
Segmentation is one of the oldest problems in CV and has the following aspects:
✓ Partitioning an image into regions of similarity.
✓ Grouping pixels and features with similar characteristics together.
✓ Helping to select regions of interest within the images. These regions can contain objects of interest that we want to capture.
✓ Segmenting an image into foreground and background to apply further processing on the foreground.

Feature extraction: image features are extracted. (A feature is some measurable characteristic of the input which has been found to be useful for recognition.) Examples of features are edges, corners, etc.
Lines and edges: these features are where sharp changes in brightness occur. They represent the boundaries of objects.
Corners: these features are points of interest in the image where intersections or changes in brightness happen.
These corners and edges represent points and regions of interest in the image. A change in brightness in this context means a change in the pixel intensity value.

Feature selection: some extracted features might be irrelevant or redundant, so feature extraction is followed by feature selection. Feature selection is choosing a feature subset that reduces dimensionality with the least amount of information loss.

High-level processing: from the previous step, we may use a small set of points or an image region that is assumed to be associated with a specific object. This information may be used to classify objects into categories, estimate their sizes, etc. More processing is done on the segmented images to identify more features from the image. Example: after segmentation has isolated a face region, identify features of the face, such as hair style, age, and gender.

COMPUTER VISION APPLICATIONS
- Manufacturing: ensure that products are being positioned correctly on an assembly line.
- Visual auditing: look for visual compliance or deterioration in a fleet of trucks, planes, windmills, transmission or power towers, and so on.
- Insurance: classify claims images into different categories.
- Medical image processing: detect tumors.
- Automotive industry: object detection for safety. For example, while parking a car, a camera can detect objects and warn the driver when the car gets too close to them.
- Social commerce: use an image of a house to find similar homes that are for sale.
- Social listening: track the buzz about your company on social media by looking for product logos.
- Retail: use the photo of an item to find its price at different stores.
- Education: use pictures to find educational material on similar subjects.
- Public safety: automated license-plate reading.

COMPUTER VISION TASKS
Object detection and recognition: detect certain patterns within the image. Examples:
✓ Detecting red eyes when taking photos in certain conditions.
✓ Face recognition.
Content-based image retrieval: retrieve images from a database based on the user's query image,
✓ using the actual feature contents of the image, such as colors, shapes, and textures,
✓ not using image metadata (keywords, tags, or descriptions).
Optical character recognition (OCR): converting handwritten text to a digital format.

COORDINATES AND IMAGES
(c) 2018, Dr. R. Elias

2D AND 3D CARTESIAN COORDINATES
- 2D Cartesian coordinate system
- 3D Cartesian coordinate system

RIGHT- AND LEFT-HANDED 3D COORDINATE SYSTEMS
- Right-handed system
- Left-handed system

DIGITAL IMAGES
Does a color have a depth?

COLOR DEPTH
COLOR DEPTH: EXAMPLES
IMAGE CATEGORIES
BINARY IMAGES
GRAYSCALE IMAGES
COLOR IMAGES AND RGB COLOR SPACE
24-BIT COLOR IMAGES
8-BIT COLOR IMAGES
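
The color-depth and image-category topics named above can be made concrete with a short sketch. The slide bodies for these topics are not in the transcript, so everything below is an assumption rather than the lecturer's own material: the helper names are hypothetical, the size estimate ignores file headers and compression, the grayscale weights are one common choice (the ITU-R BT.601 luma weights), and the binary threshold of 128 is arbitrary.

```python
# Color depth: with n bits per pixel, a pixel can take 2**n distinct values.

def colors_for_depth(bits_per_pixel):
    """Number of distinct values a pixel can take at a given color depth."""
    return 2 ** bits_per_pixel

def raw_size_bytes(width, height, bits_per_pixel):
    """Uncompressed pixel storage, rounded up to whole bytes.
    Assumption: no header, no row padding, no compression."""
    return (width * height * bits_per_pixel + 7) // 8

def rgb_to_gray(pixel):
    """Color (24-bit RGB) pixel -> 8-bit grayscale value.
    Assumption: BT.601 luma weights; other weightings are also used."""
    r, g, b = pixel
    return round(0.299 * r + 0.587 * g + 0.114 * b)

def gray_to_binary(value, threshold=128):
    """Grayscale value -> binary pixel (1 bit). The threshold is arbitrary."""
    return 1 if value >= threshold else 0

print(colors_for_depth(1))            # binary image: 2 values
print(colors_for_depth(8))            # 8-bit image: 256 values
print(colors_for_depth(24))           # 24-bit RGB: 16777216 colors
print(raw_size_bytes(640, 480, 24))   # 921600 bytes of raw pixel data

red = (255, 0, 0)
gray = rgb_to_gray(red)
print(gray, gray_to_binary(gray))
```

Going from color to grayscale to binary discards information at each step, which is why the three image categories differ so much in storage cost.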