Computer Vision Introduction


Questions and Answers

Match the historical periods with their characteristic advancements in computer vision:

1960s = Interpretation of synthetic worlds
1980s = Shift toward geometry and increased mathematical rigor
1990s = Face recognition; statistical analysis
2010s = Resurgence of deep learning

Match each computer vision field with its corresponding mathematical foundation:

Radiometry = Physics of light measurement
Optics = Behavior and properties of light
Sensor Design = Engineering of imaging devices
Computer Graphics = Modeling of objects and animation

Match the stage of vision with its description

Scene = The external environment
Image Acquisition = The capture of visual information
Perception = The extraction of meaning from visual data
Image Interpretation = A computer analysing imagery to achieve results comparable to human perception

Match the descriptions to the components in human anatomy:

Retina = Comparable to film inside a camera; consists of nerve tissue that senses light
Macula Lutea = Area providing clearest vision
Fovea Centralis = Area where all the photoreceptors are cones
Iris = Colored annulus with radial muscles

Match the photoreceptor types and animals with their visual characteristics

Rods = Sensitive to low or dim light
Cones = Sensitive to color
Humans = Rely heavily on color vision
Cats = Rely on vision in the dark

Which of the following phenomena are a 'life choice' for a photon?

Absorption = Photon's energy is transferred to atoms in a material
Transparency = Photons pass through a material with little scattering
Refraction = Light bends as it passes from one medium to another
Phosphorescence = Energy is stored for a longer time before re-emission

Match the color attribute to the corresponding term.

Hue = The mean wavelength of the perceived light
Saturation = How pure or mixed the color is
Brightness = Amount of light
Color on a monitor = Based on RGB

Match the computer vision applications to their categories

Finding people in images = Finding objects in images
Reading license plates = Optical character recognition (OCR)
Face unlock on Apple iPhone X = Vision-based biometrics
Robot playing soccer = Vision for robotics

Match the types of pixel values.

Grid = Matrix of intensity values
Pixel values = Range between 0 and 255
Black = The pixel value 0
White = The pixel value 255
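The pairs above can be seen directly in code: a grayscale image is nothing more than a matrix of intensities between 0 (black) and 255 (white). A minimal NumPy sketch (the array values are made up for illustration):

```python
import numpy as np

# A grayscale image is a grid (matrix) of intensity values:
# 0 is black, 255 is white, values in between are shades of gray.
image = np.array([
    [  0,  64, 128],
    [ 64, 128, 192],
    [128, 192, 255],
], dtype=np.uint8)

print(image.min())   # darkest pixel: 0 (black)
print(image.max())   # brightest pixel: 255 (white)
print(image.shape)   # (rows, columns) = (3, 3)
```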

Match the terms with their properties.

3D = The world is 3D and dynamic
Camera = Cameras and computers are cheap
Computer Vision System = Takes inspiration from biological vision systems
Image = An image is worth 1000 words

Match the descriptions to terminology:

Pinhole Camera = A simple camera model; captures a pencil of rays, all passing through a single point
Center of Projection = Another term for 'focal point'
Image Plane = Where the image is formed
Camera Obscura = An early camera whose principle was first described in ancient China

Match the distortion with the definition.

Perspective distortion = Angles are distorted
Non-preservation of length = Size is inversely proportional to distance
Vanishing point = Parallel lines in the world intersect in the image
Vanishing line = All directions in the same plane have vanishing points on the same line

Match the descriptions to homogeneous coordinate concepts:

Homogeneous coordinates = Represent points in a projective space
Use Cases = Make projection calculations easier
Advantage = Transformations can be chained as matrix products
Scaling = Scale invariant

Identify the main focus of the related disciplines.

Pattern recognition = Finding structure in the data
Computer graphics = Modeling objects and animations
Machine learning = Learning from data
Projective geometry = Projecting and viewing geometry

Match the type of Camera Parameters

Calibration = A step in the image acquisition pipeline
Lenses = Part of the camera model
Active research = Vision for robotics
Pinhole Cameras = The simplest camera model

Relate Computer Vision Concepts.

Sensitivity = Sensitivity to error is common in inverse problems
Vision Algorithms = Often error-prone; it is remarkable that humans see so effortlessly
Image processing = A discipline related to computer vision
Machine Learning = Very useful for computer vision

Match the descriptions to Computer Vision applications.

Self-driving Vehicles = Driving point-to-point between cities as well as autonomous flight
Motion capture = Using retro-reflective markers viewed from multiple cameras
Consumer-level applications = Turning overlapping photos into a single seamlessly stitched panorama
Visual authentication = Automatically logging family members onto your home computer

Match terms for Image filtering

Linear Filters and Convolution = A common way to process images using weighted sums of pixels
Pyramids = A multi-resolution technique for image filtering and enhancement
Edge Detection = Built on linear filters and convolution
Image Smoothing = A technique for image filtering and enhancement

Match the timeline year.

1970 = Digital image processing
1980 = Image pyramids
2000 = Face recognition and detection
2010 = Machine learning

Match the computer vision concept.

Linear filters = A common way to process images; implemented via correlation and represented mathematically
Gaussian = A smoothing kernel whose weights decrease away from the center
Computer Vision = Analysis of images to achieve human-like understanding
Machine Learning = Very useful for computer vision

Match each term related to image filtering with its correct definition

Convolution = Process of multiplying pixel values by corresponding kernel weights, with the kernel flipped
Correlation = Similar to convolution, but the kernel is not flipped
Impulse response function = Response of a system to a brief input signal
Kernel = The K × K weight matrix that defines the filter

Which of the following techniques are used to improve image processing while also reducing computational cost?

Reduction of blur = Makes blurred objects sharper
Edge detection = Highlights boundaries in the image
Linear Filters and Convolution = A common way to process images; implemented via correlation and represented mathematically
Separable filtering = Optimizes convolution by breaking a 2D kernel into one-dimensional passes

Match the computer vision milestones with the corresponding decade:

1970s = Progress in interpreting selected images and pictorial structures.
1990s = Focus on face recognition and statistical analysis.
2010s = Resurgence of deep learning and advanced architectures.
1960s = Interpretation of synthetic worlds and basic object recognition.

Match the concepts to their descriptions in computer vision:

Inverse Problem = Recovering unknowns from insufficient information, making vision difficult.
Forward Models = Developed in physics and computer graphics to describe image formation.
3D Modeling = Creating digital representations of environments from overlapping photographs.
Stereo Matching = Creating dense 3D surface models from multiple views of an object.

Match the following descriptions to the related fields of Computer Vision:

Computer Graphics = Focuses on creating images from 3D models.
Machine Learning = Provides algorithms for pattern recognition and intelligent decisions from data.
Digital Image Processing = Deals with image manipulation and enhancement at the pixel level.
Computational Photography = Aims to enhance image capture and generation using computation.

Match the following applications with their descriptions in Computer Vision:

Optical Character Recognition (OCR) = Technology to convert scanned documents or images of text into machine-readable text.
Face Detection = Identifying and locating human faces in digital images.
Motion Capture (Mocap) = Techniques to record and interpret movement, often used for animation.
3D Model Building (Photogrammetry) = Creating 3D models from 2D images, often aerial or drone photographs.

Match the descriptions to the related terms in vision:

Vision = The process of discovering what is present and where it is by looking.
Computer Vision = Analysis of pictures and videos to achieve human-like visual understanding.
Image Acquisition = The process of capturing visual data, either by human eye or camera.
Image Interpretation = The process of analyzing and understanding the acquired visual data.

Match the parts of the Human Eye with their functions:

Retina = The innermost layer of the eye, comparable to camera film, sensing light.
Iris = The colored part of the eye that controls the size of the pupil.
Pupil = The aperture in the center of the iris that allows light to enter the eye.
Lens = Focuses light onto the retina, enabling clear vision at different distances.

Match the photoreceptor types with their characteristics:

Rods = Highly sensitive photoreceptors, responsible for vision in low light conditions and grayscale perception.
Cones = Photoreceptors concentrated in the macula, responsible for color vision and high acuity in bright light.
Macula Lutea = Yellowish central portion of the retina, area of clearest and most distinct vision.
Fovea Centralis = Center of the macula, densely packed with cones, responsible for sharp central vision.

Match the concepts related to the Electromagnetic Spectrum and Vision:

Visible Light = The portion of the electromagnetic spectrum that humans can see, crucial for computer vision.
Wavelength = Determines the color of visible light, ranging approximately from 400nm to 700nm.
Photoreceptors = Cells in the retina that are sensitive to light in the visible spectrum.
Brightness Constancy = The visual system's attempt to discount illumination when interpreting colors.

Match the concepts related to Camera Modeling:

Pinhole Camera = A simple camera model where all light rays pass through a single point.
Focal Length = The distance between the lens (or pinhole) and the image plane, affecting field of view.
Aperture = The opening in the lens that controls the amount of light entering the camera.
Image Plane = The plane where the image is formed in a camera, corresponding to the film or sensor.
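The pinhole model above has a one-line mathematical core: a 3D point (X, Y, Z) projects to (fX/Z, fY/Z) on the image plane, so apparent size is inversely proportional to depth. A minimal sketch (the function name and points are illustrative, not from the lesson):

```python
def project(point3d, f):
    """Pinhole projection: a 3D point (X, Y, Z) maps to the image
    plane at (f*X/Z, f*Y/Z); size is inversely proportional to depth."""
    X, Y, Z = point3d
    return (f * X / Z, f * Y / Z)

# The same object twice as far away projects half as large.
near = project((2.0, 1.0, 5.0), f=1.0)
far = project((2.0, 1.0, 10.0), f=1.0)
print(near)  # (0.4, 0.2)
print(far)   # (0.2, 0.1)
```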

Match the Projection Properties to their descriptions:

Many-to-one Mapping = Points along the same ray in 3D space project to the same point in the 2D image.
Vanishing Point = The point in the image plane where parallel lines in 3D space appear to converge.
Perspective Distortion = The effect where objects appear smaller as their distance from the camera increases.
Projection Line = Lines in 3D space project to lines in the 2D image (unless passing through the focal point).

Match the Camera Parameters with their categories:

Intrinsic Parameters = Parameters internal to the camera, like focal length and principal point.
Extrinsic Parameters = Parameters describing the camera's position and orientation in the world.
Rotation Matrix (R) = Describes the camera's orientation relative to the world coordinate system.
Translation Vector (t) = Describes the camera's position in the world coordinate system.

Match the types of Projection with their characteristics:

Perspective Projection = Objects appear smaller as they are farther away, mimicking human vision and standard cameras.
Orthographic Projection = Parallel lines remain parallel, no perspective distortion, often used in technical drawings.
Scaled Orthographic Projection = Approximation of perspective projection, suitable when object dimensions are small compared to distance.
Affine Camera = Approximation where parallel lines remain parallel and ratios of lengths are preserved.

Match the terms related to Camera Lenses:

Aperture = Controls the amount of light entering the camera and affects depth of field.
Focal Length = Determines the field of view and magnification of the lens.
Depth of Field = The range of distances in a scene that appear acceptably sharp in an image.
Lens Aberrations = Imperfections in lenses that can cause distortions like chromatic and spherical aberration.

Match the Photon's Life Choices with their descriptions:

Absorption = Photon's energy is transferred to the material, causing an increase in temperature.
Diffuse Reflection = Light scatters in many directions from rough surfaces.
Specular Reflection = Light bounces off smooth surfaces at the same angle.
Refraction = Light bends as it passes from one medium to another due to a change in speed.

Match the Color Descriptors with their Psychophysical Correspondence:

Hue = Corresponds to the mean wavelength of light, representing the central color perceived.
Saturation = Corresponds to the variance of the light spectrum, indicating color purity.
Brightness = Corresponds to the area under the light spectrum, representing the intensity of light.
Color Constancy = The ability to perceive the 'true color' of a surface regardless of illumination changes.

Match the Color Spaces with their characteristics:

RGB = Default color space for devices, easy to implement but not perceptually uniform.
HSV = Intuitive color space based on Hue, Saturation, and Value, decoupling color and brightness.
YCbCr = Color space used in TV and video compression, separating luminance and chrominance.
L*a*b* = Perceptually uniform color space, designed to approximate human vision more accurately.

Match the Image Filtering Types with their descriptions:

Linear Filtering = Applying a weighted sum of neighboring pixels to compute a new pixel value.
Non-linear Filtering = Using operations that are not linear combinations of pixel values, like median filtering.
Separable Filtering = Optimizing filtering by breaking down a 2D kernel into two 1D kernels.
Gaussian Filtering = Using a Gaussian kernel for blurring and smoothing, with weights decreasing from the center.
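The separable-filtering pair above can be verified numerically: a Gaussian-like 2D kernel is the outer product of a 1D kernel with itself, so two cheap 1D passes give the same result as one expensive 2D pass. A minimal NumPy sketch (the helper `correlate2d` and the test image are mine, for illustration):

```python
import numpy as np

def correlate2d(img, kernel):
    """Valid-mode 2D correlation: weighted sum of each neighborhood."""
    kh, kw = kernel.shape
    h = img.shape[0] - kh + 1
    w = img.shape[1] - kw + 1
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

# A 3x3 Gaussian-like kernel is separable: it is the outer product
# of the 1D kernel [1, 2, 1]/4 with itself.
k1d = np.array([1.0, 2.0, 1.0]) / 4.0
k2d = np.outer(k1d, k1d)

img = np.arange(25, dtype=float).reshape(5, 5)

# Filtering with the 2D kernel...
full = correlate2d(img, k2d)
# ...equals filtering rows then columns with the 1D kernel.
rows = correlate2d(img, k1d.reshape(1, 3))
sep = correlate2d(rows, k1d.reshape(3, 1))
print(np.allclose(full, sep))  # True
```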

Match the Linear Filters with their applications:

Box Filter = Simple averaging filter for blurring, all weights are equal.
Bilinear Filter = Smoothing filter with weights decreasing linearly from the center, better edge preservation than box filter.
Sobel Filter = Edge detection filter emphasizing horizontal or vertical edges.
Laplacian of Gaussian (LoG) = Filter for edge and blob detection, combining Gaussian smoothing and Laplacian edge detection.

Match the Non-linear Filters with their characteristics:

Median Filter = Replaces each pixel value with the median value of its neighborhood, effective for salt-and-pepper noise.
Bilateral Filter = Edge-preserving smoothing filter, weights pixels based on both spatial proximity and intensity similarity.
Guided Image Filter = Uses a separate 'guide' image to influence the filtering of the target image, enhancing edges.
Morphological Filters = Used in binary image processing for shape manipulation, including dilation and erosion.
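The median-filter pair above is easy to demonstrate: because the median ignores extreme values, isolated salt (255) and pepper (0) pixels vanish while the flat background survives. A minimal NumPy sketch (function name and test image are illustrative assumptions):

```python
import numpy as np

def median_filter(img, size=3):
    """Replace each pixel with the median of its size x size neighborhood
    (borders handled by edge padding)."""
    pad = size // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.empty_like(img)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = np.median(padded[i:i + size, j:j + size])
    return out

# A flat gray patch corrupted by one salt (255) and one pepper (0) pixel.
img = np.full((5, 5), 100, dtype=np.uint8)
img[1, 1] = 255
img[3, 3] = 0

clean = median_filter(img)
print(np.all(clean == 100))  # both outliers are removed entirely
```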

Match the concepts related to Fourier Transform:

Frequency Domain = Representation of an image in terms of its frequency components rather than spatial pixels.
Low-pass Filter = Filter that passes low frequencies, used for blurring and smoothing images.
High-pass Filter = Filter that passes high frequencies, used for edge detection and sharpening.
Band-pass Filter = Filter that passes a specific range of frequencies, used for texture analysis and feature extraction.

Match the Image Resizing Techniques with their descriptions:

Upsampling (Interpolation) = Enlarging an image by estimating pixel values in between existing pixels.
Downsampling (Decimation) = Shrinking an image by reducing the number of pixels, often with pre-filtering to avoid aliasing.
Image Pyramids = Multi-resolution representations of images, used for multi-scale analysis and efficient processing.
Bilinear Interpolation = A simple interpolation method using linear interpolation in two directions.
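The bilinear-interpolation pair above ("linear interpolation in two directions") can be made concrete: interpolate along x on the two surrounding rows, then along y between those results. A minimal sketch, with an illustrative 2x2 image of my own choosing:

```python
import numpy as np

def bilinear(img, y, x):
    """Bilinear interpolation: linearly interpolate along x on the two
    surrounding rows, then linearly along y between those results."""
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1 = min(y0 + 1, img.shape[0] - 1)
    x1 = min(x0 + 1, img.shape[1] - 1)
    dy, dx = y - y0, x - x0
    top = (1 - dx) * img[y0, x0] + dx * img[y0, x1]
    bot = (1 - dx) * img[y1, x0] + dx * img[y1, x1]
    return (1 - dy) * top + dy * bot

img = np.array([[ 0.0, 10.0],
                [20.0, 30.0]])
# The exact center averages all four pixels.
print(bilinear(img, 0.5, 0.5))  # 15.0
```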

Match the Machine Learning types with their learning approach:

Supervised Learning = Learning from labeled input-output pairs to predict outputs for new inputs.
Unsupervised Learning = Discovering patterns and structure in unlabeled data without explicit output labels.
Semi-supervised Learning = Utilizing both a small amount of labeled data and a large amount of unlabeled data for learning.
Reinforcement Learning = Learning through interaction with an environment, receiving rewards or penalties for actions.

Match the Supervised Learning Algorithms with their characteristics:

K-Nearest Neighbors (KNN) = Non-parametric algorithm, classifies based on the majority class of the k-nearest neighbors in the feature space.
Bayesian Classification = Probabilistic approach using Bayes' theorem to calculate posterior probabilities for class membership.
Logistic Regression = Linear model for binary classification, predicting probabilities using a sigmoid function.
Support Vector Machines (SVMs) = Finds optimal hyperplanes to maximize the margin between classes, effective in high-dimensional spaces.
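The KNN description above fits in a few lines: compute distances to all training points, take the k closest, and vote. A minimal NumPy sketch on a toy dataset of my own invention:

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    """Classify x by majority vote among its k nearest training points."""
    dists = np.linalg.norm(X_train - x, axis=1)  # Euclidean distances
    nearest = np.argsort(dists)[:k]              # indices of k closest
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]             # majority class

# Two toy clusters: class 0 near the origin, class 1 near (5, 5).
X = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]], dtype=float)
y = np.array([0, 0, 0, 1, 1, 1])

print(knn_predict(X, y, np.array([0.5, 0.5])))  # 0
print(knn_predict(X, y, np.array([5.5, 5.5])))  # 1
```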

Match the Unsupervised Learning Algorithms with their applications:

K-Means Clustering = Partitions data into k clusters by iteratively assigning data points to the nearest cluster center.
Gaussian Mixture Models (GMMs) = Models data as a mixture of Gaussian distributions, useful for density estimation and soft clustering.
Principal Component Analysis (PCA) = Dimensionality reduction technique finding principal components that capture maximum variance in data.
Manifold Learning = Techniques for dimensionality reduction that assume data lies on a lower-dimensional manifold embedded in a higher-dimensional space.
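The k-means pair above alternates exactly two steps: assign each point to its nearest center, then move each center to the mean of its assigned points. A minimal sketch on two well-separated toy clusters (the data and function are illustrative, not from the lesson):

```python
import numpy as np

def kmeans(X, k, iters=10, seed=0):
    """K-means: alternate assigning points to the nearest center and
    moving each center to the mean of its assigned points."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]  # random init
    for _ in range(iters):
        dists = np.linalg.norm(X[:, None] - centers[None, :], axis=2)
        labels = np.argmin(dists, axis=1)              # assignment step
        for c in range(k):
            if np.any(labels == c):
                centers[c] = X[labels == c].mean(axis=0)  # update step
    return centers, labels

X = np.array([[0, 0], [0, 1], [1, 0], [9, 9], [9, 10], [10, 9]], dtype=float)
centers, labels = kmeans(X, k=2)
# The first three points end up in one cluster, the last three in the other.
print(labels)
```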

Match the Deep Learning concepts with their descriptions:

Activation Functions = Introduce non-linearity in neural networks, enabling complex representations.
Backpropagation = Algorithm for training neural networks by calculating gradients and updating weights to minimize loss.
Convolutional Neural Networks (CNNs) = Neural networks specialized for image processing, using convolutional layers to learn spatial hierarchies.
Recurrent Neural Networks (RNNs) = Neural networks designed for processing sequential data like video and text, maintaining temporal dependencies.

Match the Regularization Techniques with their purpose:

L2 Regularization (Weight Decay) = Shrinks large weights to prevent overfitting and improve generalization.
L1 Regularization (Lasso) = Can drive some weights to zero, effectively performing feature selection.
Dropout = Randomly sets a proportion of neuron activations to zero during training to reduce overfitting.
Data Augmentation = Increases the size and diversity of the training dataset by applying transformations to existing samples.

Match the Advanced Optimization Algorithms with their features:

Stochastic Gradient Descent (SGD) = Basic optimization algorithm using gradients from individual samples or mini-batches.
Momentum = Speeds up convergence by adding a memory-like effect to gradient descent, averaging past gradients.
AdaGrad = Adapts learning rates for each parameter based on the frequency of features.
RMSProp = Adjusts learning rates by dividing by an exponentially weighted average of squared gradients.
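The SGD and momentum pairs above can be compared on a one-dimensional toy problem, minimizing f(w) = w². A minimal sketch (the learning rate, momentum factor, and iteration count are illustrative assumptions, not tuned values from the lesson):

```python
def grad(w):
    return 2 * w  # gradient of f(w) = w**2

lr = 0.1

# Plain SGD: step directly against the current gradient.
w = 5.0
for _ in range(200):
    w -= lr * grad(w)

# SGD with momentum: accumulate a running sum of past gradients
# so steps in a consistent direction build up speed.
wm, v, beta = 5.0, 0.0, 0.9
for _ in range(200):
    v = beta * v + grad(wm)
    wm -= lr * v

# Both approach the minimum at w = 0.
print(abs(w), abs(wm))
```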

Match the Convolutional Neural Network Architectures with their key innovations:

AlexNet (2012) = Kickstarted deep learning in computer vision, using ReLUs, dropout, and data augmentation.
VGG (2014) = Demonstrated the effectiveness of deep networks with repeated small 3x3 convolutions.
GoogLeNet (2015) = Introduced the Inception module for efficient multi-scale feature extraction.
ResNet (2016) = Enabled training of very deep networks through skip connections, addressing vanishing gradient problems.

Match the concepts related to Visualizing Neural Networks:

Network Weights Visualization = Visualizing the weights of neural network layers to understand learned features.
Activation Visualization = Visualizing the activations of neurons in different layers to see how networks respond to inputs.
Feature Map Analysis = Techniques to understand how different image regions activate network units.
Class Activation Mapping (Grad-CAM) = Visualizing areas in an image that most influence the model's output classification.

Match the Generative Models with their descriptions:

Variational Autoencoders (VAEs) = Generative models creating a probabilistic field to model data distribution, useful for generating diverse samples.
Generative Adversarial Networks (GANs) = Dual-network setup with a generator and discriminator competing to generate realistic data.
Generative Pre-trained Transformer (GPT) = Transformer-based model for text generation and language understanding, adaptable to other modalities.
Boltzmann Machines = Energy-based models used for learning complex probability distributions, predecessors to deep learning.

Match the terms related to Batch Normalization:

Normalization Layer = Layer that normalizes the activations of the previous layer in each mini-batch.
Zero Mean and Unit Variance = The target distribution to which each mini-batch is normalized.
Internal Covariate Shift = Problem reduced by batch normalization, where the distribution of network activations changes during training.
Training Stability = Improved by batch normalization, leading to faster and more reliable convergence.
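The "zero mean and unit variance" pair above is a two-line computation per feature. A minimal sketch of the normalization step only (the learnable scale and shift of a full batch-norm layer are deliberately omitted, and the batch values are made up):

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    """Normalize a mini-batch to zero mean and unit variance per feature.
    eps avoids division by zero for constant features."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    return (x - mean) / np.sqrt(var + eps)

# Two features on wildly different scales...
batch = np.array([[1.0, 200.0],
                  [2.0, 400.0],
                  [3.0, 600.0]])
out = batch_norm(batch)
# ...end up on the same scale after normalization.
print(out.mean(axis=0))  # ~[0, 0]
print(out.std(axis=0))   # ~[1, 1]
```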

Match the terms related to Decision Trees and Forests:

Decision Tree = A tree-like structure for classification or regression, making decisions based on feature thresholds.
Random Forest = Ensemble of decision trees, improving robustness and generalization through averaging predictions.
Tree Depth = A design parameter controlling the complexity of decision trees, deeper trees can overfit.
Information Gain = Criterion used in decision tree learning to select the best features for splitting nodes.

Match the concepts related to Image Pyramids:

Gaussian Pyramid = Pyramid created by repeated Gaussian smoothing and downsampling, for multi-scale representation.
Laplacian Pyramid = Pyramid storing detail differences between Gaussian pyramid levels, enabling image reconstruction.
Downsampling = Reducing image resolution, often by halving dimensions and applying a low-pass filter.
Upsampling = Increasing image resolution, using interpolation techniques to estimate pixel values.

Match the Loss Functions with their primary application areas:

Cross-entropy Loss = Primarily used for classification tasks, measuring the difference between predicted and true probability distributions.
L2 Loss (Mean Squared Error) = Commonly used for regression tasks, measuring the squared difference between predicted and target values.
L1 Loss (Mean Absolute Error) = Used for regression, more robust to outliers compared to L2 loss.
Perceptual Losses = Used in image synthesis, aiming to match high-level perceptual features between generated and target images.
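The first two loss pairs above have short closed forms: cross-entropy is the negative log probability assigned to the true class, and L2 loss is the mean squared error. A minimal sketch with made-up predictions:

```python
import numpy as np

def cross_entropy(probs, true_class):
    """Classification loss: negative log probability of the true class."""
    return -np.log(probs[true_class])

def l2_loss(pred, target):
    """Regression loss: mean squared error."""
    return np.mean((pred - target) ** 2)

# A confident correct prediction has low cross-entropy loss...
good = cross_entropy(np.array([0.05, 0.9, 0.05]), true_class=1)
# ...and a confident wrong one is penalized heavily.
bad = cross_entropy(np.array([0.9, 0.05, 0.05]), true_class=1)
print(good, bad)

print(l2_loss(np.array([1.0, 2.0]), np.array([1.5, 2.5])))  # 0.25
```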

Match the techniques used in Efficient Nearest Neighbor Search:

FLANN (Fast Library for Approximate Nearest Neighbors) = Specialized library for fast approximate nearest neighbor search.
Faiss = Library optimized for very large-scale similarity search, GPU-enabled for speed.
Randomized k-d Trees = Data structure used for efficient nearest neighbor search in high-dimensional spaces.
Locality-Sensitive Hashing (LSH) = Technique for approximate nearest neighbor search by hashing similar items into the same buckets.

Match the Active Research Topics in Computer Vision with their focus areas:

Object Recognition = Developing algorithms to identify and classify objects in images and videos.
Human Behavior Analysis = Using computer vision to understand and interpret human actions and interactions.
Internet and Computer Vision = Leveraging the vast amount of visual data on the internet for computer vision tasks.
Medical Image Processing = Applying computer vision techniques to analyze and interpret medical images for diagnosis and treatment.

Match the Image Processing Operations with their effects:

Image Smoothing = Reduces noise and sharp details, blurring the image.
Edge Detection = Highlights boundaries between regions with significant intensity changes.
Image Sharpening = Enhances edges and fine details, making the image appear crisper.
Region Segmentation = Divides an image into distinct regions based on properties like color or texture.

Match the Image Filtering techniques with their characteristics:

Linear Filters = Operations where the output pixel value is a linear combination of input pixel values.
Non-linear Filters = Operations that do not rely on linear combinations, often for noise reduction while preserving edges.
Frequency Domain Filtering = Manipulating image frequencies using Fourier Transform to achieve effects like blurring or sharpening.
Spatial Domain Filtering = Applying filters directly to pixel values in the image space.

Match the terms related to Image Homogeneous Coordinates:

Homogeneous Coordinates = A system to represent points in projective space, adding an extra dimension.
Invariant to Scaling = Property of homogeneous coordinates where scaling the coordinates does not change the point.
Point in Cartesian = Represented as a ray in homogeneous coordinates.
Point at Infinity = Represented in homogeneous coordinates with the last component set to zero.
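The scale-invariance pair above can be checked in a few lines: append a 1 to go to homogeneous coordinates, divide by the last component to come back, and observe that any non-zero scaling of the homogeneous vector maps to the same Cartesian point. A minimal sketch (function names are mine):

```python
import numpy as np

def to_homogeneous(p):
    """Append a 1: (x, y) -> (x, y, 1)."""
    return np.append(p, 1.0)

def from_homogeneous(ph):
    """Divide by the last component: (x, y, w) -> (x/w, y/w)."""
    return ph[:-1] / ph[-1]

p = np.array([3.0, 4.0])
ph = to_homogeneous(p)

# Scale invariance: (3, 4, 1) and (15, 20, 5) are the same projective point.
print(from_homogeneous(ph))        # [3. 4.]
print(from_homogeneous(5.0 * ph))  # [3. 4.]
```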

Match the following historical periods with the computer vision task that was most actively researched during that era:

1970s = Interpreting selected images
1980s = Geometric modeling and mathematical rigor
1990s = Face recognition and statistical analysis
2010s = Deep learning

Match the tasks to their description in computer vision:

Object recognition = Identifying specific objects in an image
Machine inspection = Rapid parts inspection for quality assurance
Optical character recognition = Reading handwritten codes on letters
Motion capture = Capturing actors’ movements for computer animation

Match the concepts to the descriptions:

Visual data = 90% of internet traffic
Computer Vision = Machine learning applied to visual data
3D to 2D Conversion = Implies loss of information
Machine Learning = Algorithms that enable the change of computer behavior based on the data

Match the term to the description:

Image filtering = Process of modifying a picture to enhance certain features or remove noise
Computer vision = The analysis of pictures and videos in order to achieve results similar to those of people
Vision = Discovering what is present in the world and where it is by looking
Machine Learning = A scientific discipline concerned with the design and development of algorithms that allow computers to change behavior based on data

Match the following descriptions to the stage in the vision process they describe

Image Acquisition = Eye and Camera
Image Interpretation = Brain and Computer
Shape, Illumination, and Color distribution = Properties to reconstruct
Radiometry, Optics, and Sensor design = Developed with physics

Match the method to the description

Stitching = Turning overlapping photos into a single seamless photo
Morphing = Turning a picture of one person into a picture of another
3D modelling = Converting photos into a 3D model
Exposure bracketing = Merging multiple exposures under strong lighting conditions

Match the concept to the definition

Pinhole camera = Captures a pencil of rays through a single point
Vanishing point = Where parallel lines converge
Focal length = Distance between the lens and the image sensor
Aperture = Opening that constrains the rays of light

Match the lens flaws to their description

Chromatic aberration = Different refractive indices for different wavelengths
Spherical aberration = Lenses do not focus light perfectly
Radial distortion = Caused by imperfect lenses
Vignetting = The edges of an image being darker than the center

Match the component of the eye to its descriptor

Retina = Comparable to the film inside a camera
Macula lutea = The area providing the clearest, most distinct vision
Fovea centralis = An area where all the photoreceptors are cones: there are no rods in the fovea
Iris = Colored annulus with radial muscles

Match the types of light sensitive receptors to their characteristics:

Rods = Highly sensitive, and operate at night
Cones = Operate in high light, and provide color vision
Macula = Cone-concentrated area
Retina = Film inside the eye's camera

Match item to its description:

Light Source = Point, area, and sun are examples
Perception = Brightness can be affected by light, shadows, and source
Texture = Can affect how light interacts with the surface
Color = A surface property that vision tries to reconstruct despite varying illumination

Match light phenomena with their description:

Absorption = When a photon is absorbed by the material
Refraction = Light bends as it passes into water
Transparency = Material allows photons to pass through it with little scattering
Interreflection = Light bounces between multiple surfaces before reaching the viewer

Match each technique with its use:

Image stitching = Creating seamless panoramas
Exposure bracketing = Merging photos taken under strong sunlight
Image morphing = Smoothly turning one picture into another
3D modeling = Converting photos to a 3D model of the object

Match the following steps to the correct order in basic linear filtering

1 = Take a small neighborhood of pixels around a pixel in the image
2 = Multiply their values by corresponding weights
3 = Add them up to produce the value of a pixel in the output image
4 = Repeat for all pixels
x = Any order
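The numbered steps above translate directly into a double loop. A minimal NumPy sketch (the function name, edge padding, and test image are illustrative assumptions):

```python
import numpy as np

def linear_filter(image, weights):
    """Follow the steps above: for every pixel, take its neighborhood,
    multiply by the weights, and sum into the output pixel."""
    k = weights.shape[0] // 2
    padded = np.pad(image, k, mode="edge")  # handle borders by replication
    out = np.zeros_like(image, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            neighborhood = padded[i:i + weights.shape[0],
                                  j:j + weights.shape[1]]
            out[i, j] = np.sum(neighborhood * weights)
    return out

# A 3x3 averaging (box) filter leaves a constant image unchanged.
img = np.full((4, 4), 8.0)
box = np.full((3, 3), 1.0 / 9.0)
result = linear_filter(img, box)
print(np.allclose(result, 8.0))  # True
```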

Match the following with their mathematical representation

<p>Linear filter: correlation = $ g(i,j) = \sum_{k,l} f(i + k,j + l) \cdot h(k,l) $ Linear filter: convolution = $ g(i,j) = \sum_{k,l} f(i - k,j - l) \cdot h(k,l) $ Homogeneous conversion = $(x, y) \Rightarrow \begin{bmatrix}x\\y\\1\end{bmatrix}$</p>
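The only difference between the two formulas is the sign of the offsets, which amounts to flipping the kernel in both axes. A small NumPy check (helper names are illustrative, not a library API):

```python
import numpy as np

def correlate2d(f, h):
    """g(i,j) = sum_{k,l} f(i+k, j+l) * h(k,l), 'valid' region only."""
    kh, kw = h.shape
    out = np.zeros((f.shape[0] - kh + 1, f.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(f[i:i + kh, j:j + kw] * h)
    return out

def convolve2d(f, h):
    """Convolution is correlation with the kernel flipped in both axes."""
    return correlate2d(f, h[::-1, ::-1])

f = np.array([[1., 2., 3.],
              [4., 5., 6.],
              [7., 8., 9.]])
h = np.array([[0., 1.],
              [2., 3.]])

corr = correlate2d(f, h)
conv = convolve2d(f, h)
```

For a symmetric kernel (such as a box or Gaussian filter) the two operations give identical results.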

Match the technique that solves the issue

<p>Homogeneous points = How to account for points at infinity? Non-skewed pixels = The angles of the axes may not be perpendicular? Translation (t) = The camera's position in the world? Scaling of the image = Objects farther away appear smaller?</p>

Match the following words with their definition regarding Deep Learning and Computer Vision

<p>Accuracy = The number of correct predictions made by the model Overfitting = When a model learns the training data too well Loss Function = The penalty for incorrect predictions Weights = Store the network's knowledge across all of the neurons.</p>
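These terms can be made concrete with a toy binary classifier. All of the numbers below are made up for illustration:

```python
import numpy as np

labels = np.array([0, 1, 1, 0, 1])            # ground-truth classes
p1     = np.array([0.1, 0.8, 0.4, 0.3, 0.6])  # model's predicted P(class == 1)
preds  = (p1 >= 0.5).astype(int)              # hard predictions

# Accuracy: the fraction of correct predictions made by the model
accuracy = np.mean(preds == labels)

# Cross-entropy loss: the penalty for incorrect (confidently wrong) predictions
loss = -np.mean(labels * np.log(p1) + (1 - labels) * np.log(1 - p1))
```

Overfitting would show up as this loss dropping on the training set while rising on held-out data; the weights are the parameters the training procedure adjusts to reduce the loss.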

Match the types of layers in a Deep Neural Network to their definition

<p>Fully Connected Layer: FC = Uses a dense weight matrix with connections between all inputs and outputs Convolutional Neural Network (CNN) = Replaces dense weight matrices with sparse convolutional kernels Pooling = A technique to reduce the spatial dimensions X = Any other combination</p>
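The dense-versus-sparse contrast can be seen by counting weights in a toy NumPy example (all shapes and values below are chosen purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random((8, 8))                 # a tiny 8x8 single-channel "image"

# Fully connected (FC) layer: dense weight matrix, every input -> every output
W = rng.random((10, 64))               # 64 inputs, 10 outputs: 640 weights
fc_out = W @ x.flatten()

# Convolutional layer: one shared 3x3 kernel (only 9 weights), slid over the image
k = rng.random((3, 3))
conv_out = np.array([[np.sum(x[i:i + 3, j:j + 3] * k) for j in range(6)]
                     for i in range(6)])

# Max pooling: reduce spatial dimensions, here 6x6 -> 3x3 using 2x2 blocks
pooled = conv_out.reshape(3, 2, 3, 2).max(axis=(1, 3))
```

The FC layer needs 640 weights for this tiny input; the convolutional layer gets by with 9 because the same kernel is reused at every position.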

Match image analysis and processing techniques to real-world applications.

<p>Edge Detection = Medical image processing Active Vision = The ability to control how the next image will be acquired Object Detection = Self-driving cars Pattern Recognition = Facial recognition</p>

Match methods with components used during their application in transfer learning:

<p>Fine tuning = Weights (adjust pre-trained model weights) Head Replacement = Layers (replace the final layers) Weight Decay = Shrinks weights to prevent overfitting</p>
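The "shrinks weights" effect of weight decay is just an extra term in the gradient update. A one-step sketch with made-up numbers:

```python
import numpy as np

w    = np.array([2.0, -1.0, 0.5])    # current weights
grad = np.array([0.1, -0.2, 0.0])    # gradient of the loss w.r.t. w
lr, decay = 0.1, 0.01                # learning rate and weight-decay factor

# SGD step with L2 weight decay: the decay term pulls every weight toward
# zero, discouraging the large weights associated with overfitting.
w_new = w - lr * (grad + decay * w)
```

Note the third weight shrinks slightly even though its loss gradient is zero; that shrinkage is the decay at work.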

Flashcards

Inverse Problem

Computer vision aims to describe the world that we see in images and reconstruct its properties like shape and illumination.

Machine Learning

A scientific discipline concerned with the design and development of algorithms that allow computers to change behavior based on data.

Computer Vision

Describes the world that we see in one or more images and to reconstruct its properties, such as shape, illumination, and color distributions.

Difference between Machine learning and computer vision

Machine learning usually does not care about how the data is obtained or which sensors produced it, whereas in computer vision we deeply care about obtaining visual data.

Visual Data Dominance

Visual data constitutes the vast majority of internet traffic.

Vision

The process of discovering what is present in the world and where it is by looking.

Imaging Geometry

Analysis of the relationship between images and the geometry of the world from which they are formed.

Computer Vision

The process of describing the world that we see in images and to reconstruct its properties.

Computer vision inverse

Trying to do the inverse, that is, to describe the world that we see in one or more images and to reconstruct its properties.

Retina

The retina is the innermost layer of the eye and is comparable to the film inside of a camera. It is composed of nerve tissue which senses the light entering the eye.

The human eye

The human eye is the organ which gives us sight, allowing us to learn more about the surrounding world than we do with any of the other four senses.

Macula lutea

The small yellowish central portion of the retina. It is the area providing the clearest, most distinct vision.

Fovea centralis

An area where all of the photoreceptors are cones.

The forward models

The forward models that we use in computer vision are usually developed in physics and in computer graphics.

Electromagnetic spectrum

The electromagnetic spectrum, from the lowest to the highest frequency, includes radio waves, infrared radiation, visible light, and ultraviolet radiation.


Why computer vision

The world is 3D and dynamic, cameras and computers are cheap, and an image is worth 1000 words.

active topics of research

A rough timeline of some of the most active topics of research in computer vision.

what is vision

Vision is the process of discovering what is present in the world and where it is by looking.

What is computer vision

Computer Vision is the study of the analysis of pictures and videos in order to achieve results similar to those achieved by people.


interpreting images selected

1970's: interpreting selected images.

Connect a camera to a computer

Minsky's 1966 summer project: connect a camera to a computer and get the machine to describe what it sees.

Study Notes

Module 1

  • Module 1 is an introduction to computer vision and recent advances, in DS-473 Computer Vision.

Weekly Learning Outcomes

  • Understand how computer vision evolved and why it was needed.
  • Understand what computer vision is, why it is needed, and how it compares with related fields.
  • Understand the existing applications of computer vision and its promising research areas.

Contents

  • Background of computer vision
  • A brief history of computer vision
  • What computer vision is
  • Computer vision topics
  • Applications of computer vision
  • Active research topics

Background of Computer Vision

  • Humans perceive the world in three dimensions with ease, naming people in photos and guessing their emotions.
  • Optical illusions tease out the principles of how the visual system works, but a complete solution remains elusive.
  • Researchers in computer vision apply mathematical techniques to recover 3D shapes and appearances from images.
  • Reliable techniques now compute 3D environment models from overlapping photographs.
  • Accurate dense 3D surface models can be created from multiple views of an object using stereo matching.
  • Most individuals and objects can be delineated in photographs, with partial success.
  • Having a computer explain an image with the detail and causal understanding of a two-year-old remains challenging.
  • Vision is difficult because it's an inverse problem, seeking unknowns with insufficient specifying information.
  • Physics-based models, probabilistic models, or machine learning from examples needed to disambiguate solutions.
  • Modeling the complex visual world is harder than modeling vocal tracts that produce spoken sounds.
  • Forward models in computer vision use physics (radiometry, optics, and sensor design) and computer graphics.
  • These models describe how light reflects off surfaces, passes through camera lenses (or the human eye), and is projected onto a flat or curved image plane.
  • Computer vision tries to invert this process: to describe the world seen in images and reconstruct properties such as shape, illumination, and color distributions.
  • Humans and animals do this effortlessly, but computer vision algorithms are prone to error.
  • Underestimating the difficulty is a common mistake by people who have not worked in the field.
  • The misperception of easy vision dates back to early AI, with cognitive parts believed more difficult than perceptual ones.

History of Computer Vision

  • The timeline of active research in computer vision: digital image processing (1970) to vision and language (2020)
  • In 1966, Minsky tasked a first-year student to connect a camera to a computer to describe what it sees.
  • Larry Roberts, the "Father of Computer Vision" wrote his PhD Thesis in 1963

Interpretation of Synthetic Worlds 1960's

  • Larry Roberts invented machine perception of three-dimensional solids.

Interpreting Selected Images 1970's

  • Fischler and Elschlager worked on the representation and matching of pictorial structures in 1973
  • The work involved locating HAIR at (13, 23), L/EDGE at (25, 13), R/EDGE at (25, 28), L/EYE at (22, 16), R/EYE at (22, 23), NOSE at (27, 20), and MOUTH at (29, 19).

ANNs & Rigour 1980's

  • ANNs rose to prominence and then waned, causing a shift towards geometry and increased mathematical rigor

Face Recognition 1990's

  • Face recognition and statistical analysis were in vogue

Data Sets 2000's

  • Broader recognition and large annotated datasets were available & video processing started

Deep Learning 2010's

  • A resurgence of deep learning took place

Autonomous Vehicles 2020's

  • Autonomous vehicles were developed

Robot Uprising 2030's

  • A robot uprising remains a (tongue-in-cheek) open question

What is computer vision?

  • Computer vision involves extracting properties of the 3D world from images.
  • Elements include type/number of traffic scene vehicles, closest obstacle and congestion.

Computer Vision vs. Graphics

  • 3D to 2D implies information loss
  • Unlike Graphics, computer vision requires sensitivity to errors and need for models

Relation to nearby fields

  • Machine learning applied to visual data ≈ computer vision

Reasons computer vision is valuable

  • Images are worth 1000 words
  • Many biological systems rely on vision
  • The world is 3D and dynamic, with cheap cameras/computers

Example

  • An example computer vision task: finding people in images, i.e., deciding which images do and do not contain people

Topics

  • Imaging geometry
  • Camera modelling
  • Image filtering and enhancing
  • Region Segmentation
  • Color
  • Texture
  • Shape analysis

Successful application of vision

  • Face detection in digital cameras automatically focuses (AF) and optimizes exposure (AE).

Real-world applications

  • Optical character recognition (OCR) for postal codes/number plates reading.
  • Rapid inspection for quality assurance with specialized stereo vision.
  • Object recognition for automated checkout lanes.
  • Autonomous package delivery and pallet-carrying "drives".
  • Registering medical imagery for studies of brain morphology.
  • Self-driving cars capable of point-to-point driving, and aerial vehicles capable of autonomous flight.
  • Fully automated 3D model building from both aerial and drone photographs.
  • Tracking feature points in live-action footage to estimate 3D camera motion and scene shape for inserting computer-generated imagery.
  • Such effects require precise matting to insert new elements.
  • Motion capture uses retro-reflective markers/vision-based techniques for computer animation.
  • Surveillance monitors intruders, analyzes highway traffic and monitors pools for drowning victims.
  • Fingerprint recognition/biometrics authenticate access via forensics.

Consumer Level application

  • Photo-based walkthroughs allow in-home navigation via 3D photos.
  • Face detection improves camera focusing and image searching.
  • Visual authentication logs family members into the home computer.
  • Video match move and stabilization inserts 2D pictures or 3D models into videos, or removes video shake.
  • Stitching turns overlapping photos into seamless panoramas.
  • Exposure bracketing merges multiple exposures taken in challenging lighting into a single well-exposed image.
  • Morphing turns a picture of one friend into another.
  • 3D modelling converts one or more snapshots into a 3D model of the subject.

Real world application (state of the art)

  • Earth viewers such as Microsoft's Virtual Earth are built with large-scale 3D modelling

Optical character recognition (OCR)

  • Technology that converts scanned documents to text
  • If you have a scanner, it probably came with OCR software

Face Detection

  • Most digital cameras can detect the faces of people being photographed

Face Analysis & Recognition

  • Analysing and reading the faces and expressions of people

Biometrics

  • It is possible to log in without a password

Sports

  • Cameras are implemented to help make and improve decisions

Recognition

  • Used to pick out objects in supermarkets and on mobile phones

Important Points Computer Vision Focuses on

  • What information should be extracted?
  • How can it be extracted?
  • How should it be represented?
  • How can it be used to achieve the goal?

Active Research Topics

  • Object recognition
  • Human behaviour analysis
  • Internet/computer vision
  • Biometrics/soft biometrics
  • Large-scale 3D reconstruction, and medical image processing.
  • Also vision for robotics

Key principle

  • Vision was assumed to be easy back in the initial days of artificial intelligence.
  • Most believed that the cognitive parts of intelligence were more difficult than the perceptual ones.
  • We now know that this idea was incorrect.

Other information

  • Visual data from Flickr, Facebook, Instagram, and YouTube is expected to grow to roughly 90% of internet traffic

Vision

  • Discovering what is present and where it is by looking.
  • A scene is captured as an image and interpreted by the brain (perception).

Core Elements of Vision

  • It is an inherently ambiguous problem and requires prior knowledge
  • Models are usually developed in physics (radiometry, optics, sensor design) and in computer graphics

Module 2

  • Module 2 is titled Human Vision and Cameras

Weekly Outcomes

  • In order to comprehend computer vision, one must understand how the human visual system works.
  • How does a camera work, and how is an image represented?
  • Projection geometry used in cameras and lenses.

Contents

  • Human vision system
  • Human vision for computer vision
  • Pinhole camera model
  • Cameras and image formation
  • Projection geometry
  • Thin lens

Camera from scratch

  • What do you need to make one?

Human Eye key takeaways

  • The human eye gives us the sense of sight
  • The eye enables interpreting the shapes, colors, and dimensions of objects by processing the light reflected off them, and can detect bright and dim light
  • The retina is like camera film: it is made up of nerve tissue which senses the light coming into the eye.
  • The macula lutea provides the clearest, most distinct vision.
  • The fovea centralis contains only cones; there are no rods there.

Rods

  • There are approximately 120 million rods
  • More light-sensitive than cones, but not sensitive to colour

Cones

  • There are approximately 6-7 million cones
  • Provide the eye's colour sensitivity; concentrated in the area known as the macula

Electromagnetic spectrum

  • Includes Radio waves, infrared, visible light, UV, Gamma Rays and X-rays.
  • Vision is the process of discovering what is present and where by looking.
  • People do not see with their eyes, but with their brains

Human Vision to Computer Vision

  • Human vision is vastly better at recognition
  • Hints from biology are very useful for computer vision

Feedforward Notes

  • LGN: Lateral Geniculate Nucleus
  • V1: The primary visual cortex
  • V2: Visual area V2
  • IT: Inferior temporal cortex

Models of Four Layers

  • (S1 --> C1 --> S2 --> C2)
  • Model is powerful for object recognition
  • Researchers include: Riesenhuber & Poggio '99; Serre et al. '05, '07; Mutch & Lowe '06

Cameras are used to capture images for

  • Image formation

Image

  • A grid/matrix of intensity values

Pinhole Camera Model

  • Rays travel through a small hole (the aperture).
  • This reduces blurring
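Under the pinhole model, a 3D point (X, Y, Z) in camera coordinates projects to (f·X/Z, f·Y/Z) on the image plane, where f is the focal length. A minimal plain-Python sketch (the function name and the sample points are illustrative):

```python
def project(point, f=1.0):
    """Pinhole projection of a 3D camera-space point onto the image plane."""
    X, Y, Z = point
    return (f * X / Z, f * Y / Z)

# The same point at twice the depth projects to half the image offset,
# which is why objects farther away appear smaller.
near = project((1.0, 2.0, 2.0))
far  = project((1.0, 2.0, 4.0))
```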

Camera Obscura: The pre-camera

  • Known since antiquity in China and Greece.
  • This information is attributed to Larry Seitz
  • The camera obscura was used for tracing

One of the Oldest Surviving Photographs

  • The oldest surviving photograph can be traced back to Joseph Niépce, 1826
  • The exposure took 8 hours; the photograph is stored at UT Austin
  • He teamed up with Daguerre and they eventually created Daguerreotypes

Dimensionality reduction machine (3d to 2d)

  • It loses angles and distances but reduces the scene to two dimensions

Projection Properties

  • Parallel lines converge at a vanishing point
  • Each direction in space has its own vanishing point, except for directions parallel to the image plane, which remain parallel in the image

Homogeneous Coordinates

  • Coordinate scaling
  • Scale invariance: kX/kW = X/W and kY/kW = Y/W, so [kX, kY, kW] represents the same point as [X, Y, W]

Basic Geometry in Homogeneous Coordinates

  • The line equation is ax + by + c = 0
  • Append 1 to a pixel coordinate (x, y) to get its homogeneous coordinate [x, y, 1]
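These conventions are easy to exercise in NumPy. A small sketch (helper names are illustrative) showing scale invariance and the point-on-line test:

```python
import numpy as np

def to_homogeneous(p):
    """(x, y) -> [x, y, 1]"""
    return np.array([p[0], p[1], 1.0])

def from_homogeneous(h):
    """[x, y, w] -> (x/w, y/w); w = 0 would encode a point at infinity."""
    return (h[0] / h[2], h[1] / h[2])

p = to_homogeneous((3.0, 4.0))
scaled = from_homogeneous(5.0 * p)     # k*[x, y, w] is the same 2D point

# A line a*x + b*y + c = 0 is the vector [a, b, c]; a homogeneous point
# lies on the line exactly when their dot product is zero.
line = np.array([1.0, -1.0, 1.0])      # x - y + 1 = 0; (3, 4) satisfies it
residual = np.dot(line, p)
```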

Current view to the art

  • Current 3D models of the Earth can be viewed in Microsoft's Virtual Earth

Optical character recognition (OCR)

  • Most people with a scanner probably have OCR software

Face detection

  • Many new digital cameras from companies like Canon, Sony, and Fuji can detect faces

Vision-Based Biometrics

  • The famous Afghan girl was identified by her iris patterns

Vision-based interaction with games

  • Is being implemented to put faces on avatars

Real world examples

  • Mobileye vision system
  • Used by 70% of car manufacturers

Computer vision in sports

  • Hawk-Eye is implemented to help make and improve decisions.

Vision in space

  • NASA's Mars Exploration Rover Spirit captured a westward view from the top of the low plateau where it spent the closing months of 2007

Medical imaging

  • Medical imaging has developed 3D imaging through MRI/CT scans
  • Surgeons and doctors are guided by these images during procedures

Key notes for Computer vision

  • Related Disciplines include: image processing, pattern recognition, photogrammetry, computer graphics, artificial intelligence, machine learning, projective geometry, control theory

Light and Color

  • There exist myriad consumer-level applications, such as things you can do with your own personal photos and video
    • There is stitching to turn overlapping photos into a panorama, bracketing to merge multiple exposures, morphing to blend images, and 3D modelling of people and objects.

A photon's life choices include

  • Absorption, diffusion, reflection, transparency, refraction, fluorescence, subsurface scattering, phosphorescence, and interreflection

The Human Eye

  • Can be thought of like a camera.
  • The iris is the coloured part that regulates the amount of light entering
  • The retina contains photoreceptor cells, the rods and cones, which detect light

Physiology of Color Vision

  • There are 3 kinds of cones
  • Tasks include detection/localization and scene labelling, as well as language generation (e.g., describing a shopping-market scene).
  • Visual data can often come from Flickr, Facebook, Instagram, and YouTube.

Background on electro-magnetic spectrum

  • Source: Guodong Guo.

Vision for computer vision

  • Computer vision is better with hints from biology

Feedforward processing

  • LGN: the lateral geniculate nucleus
  • V1: the primary visual cortex

Current State of the Art

  • Earth viewers (3D modeling)
  • Microsoft is behind the virtual earth

Computer Vision Focuses On

  • What information should be extracted?
  • How can it be extracted?
  • How should it be represented?
  • How can it be used to achieve the goal?

History Of Computer Vision

  • A rough timeline of how vision has developed

1960

  • Minsky hired a first-year undergraduate student to solve a summer problem

1970's

  • The representation and matching of pictorial structures was studied in 1973

1980's

  • ANNs rose and waned; the field shifted towards geometry and increased mathematical rigor

1990's

  • Face recognition and statistical analysis were developed and became popular

2000's

  • Broader recognition and large annotated data sets were available

2010's

  • Resurgence of deep learning

2020's

  • Autonomous Vehicles now exist

2030's

  • The open question now: a robot uprising?

Computer Vision

  • Vision is the process of discovering what is present in the world and where it is by looking.
  • Computer vision studies pictures and videos with the goal of achieving results similar to those achieved by people.
  • An argument can be made that an image is worth 1000 words.

More facts

  • The world is 3d and dynamic
  • Cameras and computers are cheap

Computer Vision is useful for

  • finding shapes and colour
  • Linear filters are used for linear smoothing and edge detection

The History Of Research Computer Vision

  • A rough timeline of some of the most active topics of research

Misc

  • Humans do not see with their eyes, but with their brains
  • Machine learning does not care how the data is obtained (or about sensors), but computer vision does.
  • Fischler and Elschlager studied the representation and matching of pictorial structures

Types

  • Face detection
  • Object/face recognition
  • Biometrics

Module 3: Light and Colour

Weekly Outcomes:

  • An understanding of light, colour, reflection, and absorption in nature is key for vision.
  • Understanding how an image is represented.
  • What a pixel represents, and which colour representations are used for vision

The Contents

  • Light, colour, reflection, and absorption.
  • Understanding the human eye
  • What a pixel is and how images are represented
  • Understanding pixel colour, brightness, and intensity.

Key points

  • When using computer vision, perception can be ambiguous.

The Bottom Line

  • Use reading materials such as those by Richard Szeliski and Jean Ponce.

Szeliski

Electromagnetic Spectrum

  • The electromagnetic spectrum, from the lowest to the highest frequency, includes radio waves (commercial radio and television), microwaves (radar), infrared, visible light, ultraviolet radiation, X-rays, and gamma rays.

Human Vision System

  • Visual Fields: people do not "see" with their eyes, but with their brains.
  • LGN: the lateral geniculate nucleus; V1: the primary visual cortex; V2: visual area V2; IT: inferior temporal cortex.

Computer Vision 101

  • The human visual system is vastly better at recognition than any computer system, so hints from biology may be very useful.

1970 - PRESENT

  • Digital image processing / block-world labeling / generalized cylinders / pattern recognition / stereo correspondence / intrinsic images / optical flow / structure from motion / image pyramids / shape from shading, texture, and focus
  • Physically-based modeling / regularization / Markov random fields / Kalman filters / 3D range data processing / projective invariants
  • Factorization / physics-based vision / graph cuts / particle filtering / energy-based segmentation
  • Face recognition and detection
  • Image-based modeling and rendering
  • Texture synthesis and inpainting
  • Computational photography
  • Feature-based recognition
  • Category recognition
  • Machine learning
  • Modeling and tracking humans
  • Semantic segmentation
  • SLAM and VIO
  • deep learning
  • vision and language

What is Optical Character Recognition (OCR)?

  • Technology to convert scanned documents to text
  • If you have a scanner, it most likely came with OCR software.

Face detection

  • Most digital cameras can now detect faces

Active Research Topics

  • Object recognition
  • Human behaviour analysis
  • Internet computer vision, biometrics and soft biometrics, medical image processing, and vision for robotics.

Vision Based Interactions

  • Using Digimask to put faces on avatars

Note on light and colour

  • There are many variables and parameters that need to be accounted for in regards to the light and colour with computer programming.

Module 4: Image Transformation and Filtering

Learning Outcomes:

  • Develop a comprehensive knowledge of filtering.
  • Master image transformation and filtering techniques

Contents

  • Techniques of image transformation and filtering

  • Linearly separable filtering techniques

  • Advanced filtering

Important Tips to remember

  • Always refer back to Chapter 3
  • It helps with image processing

Key Notes

  • An image is a function f(x, y)
  • Linear filtering with a kernel h is called convolution
  • A filter is separable if it factors into a 1D row filter and a 1D column filter, which is cheaper to apply
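A separable kernel is one whose 2D weights are the outer product of a column vector and a row vector, so one 2D filtering pass can be replaced by two cheaper 1D passes. A small NumPy check using the 3x3 "tent" kernel as an example:

```python
import numpy as np

# The 3x3 tent (triangle) kernel is separable: it is the outer product
# of the 1D filter [1, 2, 1] with itself.
col = np.array([[1.0], [2.0], [1.0]])
row = np.array([[1.0, 2.0, 1.0]])
kernel2d = col @ row

# Filtering with col then row costs 3 + 3 = 6 multiplies per pixel,
# versus 9 for applying the full 2D kernel directly.
```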

Filters

Impulse response image, cross-correlation output, and convolution output

  • The convolution output equals the cross-correlation output with the kernel flipped in both axes

More info

  • A point in the world and its image are related point to point

Linear assumptions

  • Pixels are assumed square with no skew, which leads to a simpler imaging model

Module

  • The camera has a location in space.
  • Vanishing points, lines, and shapes in that space
  • help with image understanding

Main notes and Topics for section 2

  • You need a pinhole camera and the geometry that applies to the image
  • The aperture helps form the image from rays of light

Notes to the student on this module

  • Human vision provides models that computer vision should mirror at these different points

What it implies

  • 3D-to-2D projection implies information loss, so vision systems must be sensitive to errors.

Machine learning is then applied to the resulting data sets

Module 5

Deep Learning

  • Machine learning marks a transition from traditional techniques
  • It covers supervised learning (with labelled data) and unsupervised learning (without labelled data).
  • It also covers algorithms and practical applications of deep learning, where the data sets are used for image recognition.

Types:

  • Non-linear
  • ANN

Key Note

  • A major focus for machine learning research is to automatically learn to recognize complex patterns and make intelligent decisions based on data.

Chapter - 5: Machine Learning

  • Can happen with linear discriminant analysis

  • It may also require quadratic discriminant analysis

Extra Notes:

  • Supervised learning: with labelled data. Unsupervised learning: without labelled data.
