Data Consistency in Measurements Quiz

What is the key difference between PCA and Factor Analysis (FA) in terms of the number of axes?

The number of axes in PCA is equal to the number of variables, while in FA it is limited to a few factors. (correct)
Both PCA and FA have the same number of axes.
The number of axes in FA is equal to the number of variables, while in PCA it is limited to a few factors.
PCA has no limit on the number of axes, while FA is limited to a few factors.

What is the purpose of data discretization?

To group numerical data into categories for analysis. (correct)
To confuse the analysis by making data less understandable.
To increase the variance in the dataset.
To convert categorical data into numerical data.

In equal width binning, how is the data sorted for grouping?

From smallest to largest. (correct)
From largest to smallest.
No specific sorting method.
Based on random selection.

What does equal depth binning ensure?

Equal proportions of data in each category. (correct)
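
The two binning schemes asked about above can be compared with a minimal sketch. The sample values and both helper functions are illustrative assumptions, not part of the quiz material, and the sketch assumes the values are not all identical.

```python
# Illustrative values only (not from the quiz); most are small, two are large.
values = [1, 2, 2, 3, 3, 4, 5, 6, 50, 100, 7, 8]

def equal_width_bins(data, k):
    """Split the value range into k intervals of the same width."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / k
    bins = [[] for _ in range(k)]
    for x in sorted(data):  # sorted from smallest to largest
        # The maximum value would map to index k, so clamp it into the last bin.
        idx = min(int((x - lo) / width), k - 1)
        bins[idx].append(x)
    return bins

def equal_depth_bins(data, k):
    """Split the sorted data into k bins holding (nearly) the same number of values."""
    ordered = sorted(data)
    n = len(ordered)
    # Bin i covers positions [i*n//k, (i+1)*n//k) of the sorted data.
    return [ordered[i * n // k:(i + 1) * n // k] for i in range(k)]

width_bins = equal_width_bins(values, 3)
depth_bins = equal_depth_bins(values, 3)
print([len(b) for b in width_bins])  # bin sizes under equal width
print([len(b) for b in depth_bins])  # bin sizes under equal depth
```

With this sample, equal width puts ten of the twelve values in the first bin, while equal depth places four values in every bin.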

What is one common issue caused by outliers in data analysis?

Causing skewness and affecting the distribution. (correct)

How does equal width binning handle skewed data?

It replaces skewed data with median values. (correct)

What is the purpose of checking if scales are similar for columns with measurements?

To ensure consistency in the units of measurement. (correct)

What is the main objective of Model Planning in the software for Data Pre-Processing?

Determine methods and workflow for model building. (correct)

What is the function of the testing set in the Model Building phase?

To establish the accuracy of the model. (correct)

In geospatial datasets, why is it important to check if abbreviations of locations are consistent?

To ensure accurate geographic referencing. (correct)

What differentiates the testing set from the training set in the Model Building phase?

The training set helps the algorithm learn, while the testing set evaluates model accuracy. (correct)
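
The split between the two sets can be sketched as follows. The `train_test_split` helper, the 20% test fraction, and the stand-in rows are assumptions made for illustration; the quiz does not specify a ratio.

```python
import random

def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows and return (train, test) with no overlap."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

rows = list(range(100))  # stand-in for 100 labelled records
train, test = train_test_split(rows)
print(len(train), len(test))
```

The model would be fitted on `train` only, and `test` would be held back until the end to estimate accuracy on data the model has not seen.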

What role does the Model Building phase play in developing datasets for production?

It allows testing of the final model with live data. (correct)

What is the percentage of errors in the predictions?

8.5% (correct)

Which attribute was identified as having the best ability to increase group homogeneity?

Income (correct)

What percentage of rows remain after removing 37 from a total of 600?

93.83% (correct)
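
The arithmetic behind this answer can be checked with a few lines. The variable names are illustrative; the two numbers come from the question.

```python
total_rows = 600
removed_rows = 37

remaining_rows = total_rows - removed_rows        # rows left after removal
remaining_pct = remaining_rows / total_rows * 100  # share of the original rows

print(f"{remaining_rows} rows remain ({remaining_pct:.2f}%)")
```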

What is the likelihood of an individual saying 'yes' in the group with income greater than $51,284.3?

80% (correct)

When using 'Region' as the attribute for splitting, what percentage of rows are involved?

20.83% (correct)

How many rows are left after considering 'Age' as an attribute?

55 (correct)

What is the main disadvantage of increasing the number of epochs to an infinite number?

Increased validation loss. (correct)
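
One common way to act on this is to stop training once validation loss stops improving. The sketch below is an assumption-laden illustration: the `early_stopping_epoch` helper, the `patience` setting, and the loss values are all hypothetical and do not come from the quiz.

```python
def early_stopping_epoch(val_losses, patience=2):
    """Return the 1-based epoch with the lowest validation loss seen before the
    loss has failed to improve for `patience` consecutive epochs."""
    best_loss = float("inf")
    best_epoch = 0
    stale = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch, stale = loss, epoch, 0
        else:
            stale += 1
            if stale >= patience:
                break  # validation loss keeps rising, so stop here
    return best_epoch

# Hypothetical validation losses: falling at first, then rising as the model overfits.
val_losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.58, 0.66]
best = early_stopping_epoch(val_losses)
print(best)
```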

In what scenario is SVM preferred over ANN?

Nonlinearly separable data. (correct)

What transformation is needed to move from a linear to a nonlinear boundary in SVM?

Data transformation into higher dimensional space. (correct)

How do kernel methods help in SVM?

They transform data into higher dimensional spaces for easier separation. (correct)
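
The idea of moving to a higher dimensional space can be illustrated with a toy case. The data, the `feature_map` function, and the separability check are assumptions made for this sketch. A real kernel SVM computes inner products in the mapped space implicitly rather than building the mapped points as done here.

```python
# Toy 1D data: the outer points are class 1, the inner points are class -1.
points = [-3.0, -2.5, -1.0, 0.0, 1.0, 2.5, 3.0]
labels = [1, 1, -1, -1, -1, 1, 1]

def separable_by_threshold(values, labels):
    """True if a single threshold on `values` splits the two classes."""
    ordered_labels = [lab for _, lab in sorted(zip(values, labels))]
    # With one threshold, the sorted labels may switch class at most once.
    changes = sum(a != b for a, b in zip(ordered_labels, ordered_labels[1:]))
    return changes <= 1

def feature_map(x):
    """Hypothetical explicit mapping from 1D to 2D: x -> (x, x^2)."""
    return (x, x * x)

mapped = [feature_map(x) for x in points]
squared = [second for _, second in mapped]  # the new coordinate

print(separable_by_threshold(points, labels))   # original space
print(separable_by_threshold(squared, labels))  # along the added x^2 axis
```

In the original space no single cut separates the classes, but along the added x^2 axis one threshold does, which corresponds to a straight line in the mapped 2D space.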

Why are ensemble classification techniques considered better than decision trees?

They combine multiple models for improved accuracy and robustness. (correct)
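
A minimal sketch of combining models by majority vote is shown below. The three sets of predictions are hypothetical and were chosen so that each model errs on a different sample; when models make the same mistakes, voting gives no such gain.

```python
from collections import Counter

def majority_vote(votes):
    """Return the label predicted by most models for one sample."""
    return Counter(votes).most_common(1)[0][0]

def accuracy(predicted, truth):
    """Fraction of samples where the prediction matches the true label."""
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)

true_labels = ["yes", "no", "yes", "yes", "no"]
# Hypothetical outputs of three individual classifiers on five samples.
model_outputs = [
    ["yes", "no", "no", "yes", "no"],    # wrong on the third sample
    ["yes", "yes", "yes", "yes", "no"],  # wrong on the second sample
    ["no", "no", "yes", "yes", "no"],    # wrong on the first sample
]

# zip(*model_outputs) groups the three votes cast for each sample.
ensemble = [majority_vote(votes) for votes in zip(*model_outputs)]

print([accuracy(m, true_labels) for m in model_outputs])
print(accuracy(ensemble, true_labels))
```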

How is the relationship between a soloist and an orchestra analogous to the relationship between decision trees and ensembles?

Ensembles generally outperform an individual decision tree. (correct)

What is the main purpose of a perceptron in classification?

To determine the class of a data point based on a separating line. (correct)

In the context of support vector machines, what do support vectors represent?

Vectors used to define the plane separating two classes. (correct)

What is a common characteristic of an invalid line in classification using a perceptron?

It passes through both red and green dots. (correct)

Why are the input values normalized before being input into the perceptron for classification?

To ensure equal dispersion of values. (correct)
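
One way to put inputs on a comparable scale is min-max scaling, sketched below. The quiz does not say which normalization is used, so the method, the `min_max_scale` helper, and the sample columns are assumptions. The helper also assumes the column is not constant.

```python
def min_max_scale(column):
    """Rescale a column linearly so its smallest value is 0 and its largest is 1."""
    lo, hi = min(column), max(column)
    return [(x - lo) / (hi - lo) for x in column]

# Hypothetical inputs measured on very different scales.
ages = [20, 35, 50, 65]
incomes = [20000, 40000, 60000, 100000]

scaled_ages = min_max_scale(ages)
scaled_incomes = min_max_scale(incomes)
print(scaled_ages)
print(scaled_incomes)
```

After scaling, both columns span the same range, so neither input dominates the weighted sum purely because of its units.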

What happens when a data point has a negative number output after being input into the perceptron?

It is classified as belonging to the 'No' category. (correct)

How is a perceptron line used to classify data points?

By determining which side of the line a point falls on. (correct)
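
The perceptron questions above can be tied together with a short sketch. The weights, bias, and sample points are hypothetical, and treating an output of exactly zero as 'Yes' is a convention chosen here, not something the quiz states.

```python
def perceptron_output(weights, bias, point):
    """Weighted sum of the inputs plus the bias; its sign tells the side of the line."""
    return sum(w * x for w, x in zip(weights, point)) + bias

def classify(weights, bias, point):
    """'Yes' for a non-negative output, 'No' for a negative one."""
    return "Yes" if perceptron_output(weights, bias, point) >= 0 else "No"

# Hypothetical separating line x1 + x2 = 1 on inputs normalized to [0, 1].
weights, bias = [1.0, 1.0], -1.0

above = classify(weights, bias, (0.9, 0.8))  # lies above the line
below = classify(weights, bias, (0.1, 0.2))  # lies below the line
print(above, below)
```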