7 Questions
Match the following machine learning tasks with their descriptions:
- Supervised Learning = Training examples have provided labels
- Unsupervised Learning = No labels provided for each training example
- Clustering = Grouping similar data points together
- Anomaly detection = Identifying abnormal or unusual data points
Match the following dimensionality reduction techniques with their descriptions:
- Principal Component Analysis (PCA) = Identifies and extracts important features in a dataset
- PCA Process = Standardizes the data and finds new principal components
- Principal Components = Orthogonal variables capturing maximum variance in the data
- Dimensionality Reduction = Reduces a dataset's dimensionality while preserving variance
Match the steps of Principal Component Analysis (PCA) with their descriptions:
- Step 1: Standardize the data = Subtract the mean and divide by the standard deviation for each feature so that all variables are on a comparable scale.
- Step 2: Compute the covariance matrix = Calculate the covariance matrix from the standardized data to show the relationships and variances between pairs of variables.
- Step 3: Compute the eigenvectors and eigenvalues = Obtain eigenvectors and eigenvalues by decomposing the covariance matrix; each eigenvector represents a principal component and its corresponding eigenvalue represents the amount of variance explained.
- Step 4: Select the principal components = Determine the number of principal components to retain based on the amount of variance explained by each, often using a threshold such as retaining components that explain a certain percentage of the total variance.
- Step 5: Project the data onto the new feature space = Project the original data onto the selected principal components by taking the dot product of the data and the selected components, yielding a reduced-dimensional representation.
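For reference, these steps can be written compactly. Assuming X denotes the standardized n × p data matrix (notation ours, not from the quiz):

$$
C = \frac{1}{n-1} X^\top X, \qquad C\,v_i = \lambda_i v_i, \qquad Z = X W_k
$$

Each eigenvector v_i of C is a principal component, its eigenvalue λ_i is the variance that component explains, and W_k = [v_1, …, v_k] stacks the top-k eigenvectors so that Z is the reduced-dimensional projection.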
Match the applications of Principal Component Analysis (PCA) with their descriptions:
- Dimensionality reduction = By selecting a subset of principal components, PCA reduces the dimensionality of the dataset, making it easier to visualize and analyze.
- Data visualization = PCA can be used to visualize high-dimensional data by projecting it onto a lower-dimensional space, typically two or three dimensions.
- Noise reduction = By keeping only the principal components that capture most of the variance, PCA can remove noise and retain the most informative features.
- Feature extraction = PCA can be used to extract the most important features from a dataset and discard less relevant or redundant features.
Match the mathematical operations in PCA with their descriptions:
- Calculate the mean = Determine the mean value for each variable in the dataset.
- Compute the covariance matrix = Calculate the covariance matrix based on the standardized data to show relationships and variances between pairs of variables.
- Compute eigenvectors and eigenvalues = Obtain eigenvectors and eigenvalues by decomposing the covariance matrix; each eigenvector represents a principal component and its corresponding eigenvalue represents the amount of variance explained.
- Select principal components = Determine the number of principal components to retain based on the amount of variance explained by each, often using a threshold like retaining components that explain a certain percentage of the total variance.
Match the steps in reducing dimensionality using PCA with their descriptions:
- Calculate mean for each variable = Determine the mean value for each variable in the dataset.
- Compute covariance matrix = Calculate the covariance matrix based on the standardized data to show relationships and variances between pairs of variables.
- Select principal components = Determine the number of principal components to retain based on the amount of variance explained by each, often using a threshold like retaining components that explain a certain percentage of the total variance.
- Project data onto new feature space = Project the original data onto the selected principal components to obtain a reduced-dimensional representation by taking the dot product of the data and the selected principal components.
Match PCA applications with their benefits:
- Dimensionality reduction = Reduces complexity and makes it easier to visualize and analyze data.
- Data visualization = Enables representation of high-dimensional data in lower dimensions for easier understanding and interpretation.
- Noise reduction = Removes irrelevant or less informative features from datasets, leading to cleaner data analysis results.
- Feature extraction = Identifies and retains essential information while discarding redundant or less relevant features.
Study Notes
Machine Learning Tasks
- Classification: involves predicting a categorical label or class that an instance belongs to
- Regression: involves predicting a continuous or numerical value
- Clustering: involves grouping similar instances together
- Dimensionality Reduction: involves reducing the number of features or variables in a dataset
Dimensionality Reduction Techniques
- Principal Component Analysis (PCA): linear technique that projects high-dimensional data onto a lower-dimensional space
- t-SNE: non-linear technique that preserves local relationships in the data (a usage sketch for PCA and t-SNE follows this list)
- Autoencoders: neural networks that learn to compress and reconstruct data
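As a minimal sketch of how the first two techniques are typically invoked, assuming scikit-learn is available (the data here is a random placeholder; autoencoders are omitted since they require a neural-network library):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

X = np.random.rand(200, 10)  # placeholder: 200 samples, 10 features

# PCA: linear projection onto the top-2 principal components
X_pca = PCA(n_components=2).fit_transform(X)

# t-SNE: non-linear embedding that preserves local neighborhoods
X_tsne = TSNE(n_components=2, perplexity=30.0).fit_transform(X)

print(X_pca.shape, X_tsne.shape)  # (200, 2) (200, 2)
```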
Steps of Principal Component Analysis (PCA)
- Data Standardization: subtracting the mean and dividing by the standard deviation for each feature
- Covariance Matrix Calculation: computing the covariance between each pair of features
- Eigenvector and Eigenvalue Calculation: solving for the eigenvectors and eigenvalues of the covariance matrix
- Component Selection: selecting the top k eigenvectors corresponding to the k largest eigenvalues
- Transformation: projecting the original data onto the selected eigenvectors (the five steps are sketched in code after this list)
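Putting the five steps together, here is a from-scratch sketch in NumPy (variable names and the choice of k = 2 are ours, for illustration only):

```python
import numpy as np

def pca(X, k=2):
    # Step 1: standardize each feature (zero mean, unit variance)
    X_std = (X - X.mean(axis=0)) / X.std(axis=0)

    # Step 2: covariance matrix of the standardized data
    cov = np.cov(X_std, rowvar=False)

    # Step 3: eigenvectors and eigenvalues (eigh, since cov is symmetric)
    eigvals, eigvecs = np.linalg.eigh(cov)

    # Step 4: keep the k eigenvectors with the largest eigenvalues
    top_k = np.argsort(eigvals)[::-1][:k]
    components = eigvecs[:, top_k]

    # Step 5: project the data onto the selected eigenvectors
    return X_std @ components

X = np.random.rand(100, 5)  # placeholder data
print(pca(X, k=2).shape)    # (100, 2)
```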
Applications of Principal Component Analysis (PCA)
- Data Visualization: reducing dimensionality for visualization in lower-dimensional spaces
- Anomaly Detection: identifying outliers and anomalies in the data
- Feature Extraction: selecting the most informative features in a dataset
- Noise Reduction: removing noise and correlations in the data (illustrated in the sketch after this list)
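To illustrate the noise-reduction idea, one can project onto the leading components and then reconstruct: low-variance directions, where noise tends to concentrate, are discarded. A sketch assuming scikit-learn; the rank-1 signal and the noise level are arbitrary assumptions of ours:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
signal = rng.normal(size=(300, 1)) @ rng.normal(size=(1, 8))  # rank-1 "clean" data
noisy = signal + 0.1 * rng.normal(size=signal.shape)          # additive noise

# Keep only the dominant component, then map back to the original space
pca = PCA(n_components=1)
denoised = pca.inverse_transform(pca.fit_transform(noisy))

# Mean error vs. the clean signal, before and after denoising
print(np.abs(noisy - signal).mean(), np.abs(denoised - signal).mean())
```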
Mathematical Operations in PCA
- Eigen Decomposition: decomposing a matrix into eigenvectors and eigenvalues
- Matrix Multiplication: projecting the original data onto the selected eigenvectors
- Orthogonal Projections: projecting data onto a lower-dimensional space (a numerical check of these operations follows this list)
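These properties are easy to verify numerically; a quick check in NumPy (the data is an arbitrary placeholder):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 4))
C = np.cov(X, rowvar=False)  # symmetric covariance matrix

# Eigen decomposition: C = V diag(w) V^T
w, V = np.linalg.eigh(C)
print(np.allclose(C, V @ np.diag(w) @ V.T))  # True

# The eigenvectors form an orthogonal basis: V^T V = I
print(np.allclose(V.T @ V, np.eye(4)))       # True

# Matrix multiplication projects the data onto the top-2 eigenvectors
Z = X @ V[:, -2:]  # eigh returns eigenvalues in ascending order
print(Z.shape)     # (50, 2)
```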
Steps in Reducing Dimensionality using PCA
- Selecting the Number of Components: choosing the number of dimensions to reduce to (see the sketch after this list)
- Computing the Component Scores: projecting the original data onto the selected eigenvectors
- Transforming the Data: converting the original data into the lower-dimensional space
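A common way to choose the number of components is a cumulative explained-variance threshold, as mentioned in the notes above (a sketch assuming scikit-learn; the 95% cutoff and the placeholder data are our own choices):

```python
import numpy as np
from sklearn.decomposition import PCA

X = np.random.rand(200, 20)  # placeholder data

# Fit with all components to inspect the variance spectrum
cumulative = np.cumsum(PCA().fit(X).explained_variance_ratio_)

# Smallest k whose components explain at least 95% of total variance
k = int(np.searchsorted(cumulative, 0.95) + 1)

X_reduced = PCA(n_components=k).fit_transform(X)
print(k, X_reduced.shape)
```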
PCA Applications and Benefits
- Facial Recognition: reducing dimensionality for efficient face recognition
  - Benefit: improved computational efficiency
- Gene Expression Analysis: identifying relevant genes in microarray data
  - Benefit: improved feature selection and identification of key genes
- Image Compression: reducing dimensionality for efficient image compression
  - Benefit: improved storage and transmission efficiency
This quiz covers the fundamentals of supervised and unsupervised learning, focusing on the differences between them and on unsupervised tasks such as clustering, anomaly detection, and dimensionality reduction, including Principal Component Analysis (PCA).