Classification and Prediction

Questions and Answers

What type of class labels does classification primarily predict?

  • Time-series class labels
  • Categorical class labels (correct)
  • Numerical class labels
  • Continuous-valued labels

In the context of classification, what is the primary function of the training set?

  • To classify future and unseen objects.
  • To estimate the accuracy of the model.
  • To normalize data to improve performance.
  • To construct a model based on classifying attributes and class labels. (correct)

In classification, what is the role of the 'test sample' in evaluating the model?

  • To categorize loan applications as safe or risky.
  • To predict the expenditures of potential customers.
  • To identify irrelevant or redundant attributes.
  • To compare its known label against the model's classified result. (correct)

What is the primary goal of 'relevance analysis' in data preparation for classification?

  • To remove irrelevant or redundant attributes. (correct)

Which of the following is a critical consideration when evaluating classification methods?

  • Handling noisy and missing values, also known as robustness. (correct)

What is an 'internal node' in the context of decision tree induction?

  • Denotes a test on an attribute. (correct)

In decision tree induction, what is the purpose of 'tree pruning'?

  • To identify and remove branches that reflect noise or outliers. (correct)

In what manner does the basic algorithm for decision tree induction construct the tree?

  • In a top-down recursive divide-and-conquer manner. (correct)

What is a key requirement for attributes used in the basic algorithm for decision tree induction?

  • They must be categorical. (correct)

What condition must be met for recursive partitioning to stop?

  • When all samples for a given node belong to the same class. (correct)

According to the decision tree algorithm, what action is taken when there are no samples for a particular branch test-attribute = ai?

  • A leaf is created with the majority class in the samples. (correct)

During decision tree induction, once an attribute has been used at a node, why is it generally not considered in any of the node’s descendants?

  • Because the data is already partitioned based on that attribute's values. (correct)

In the context of decision tree induction, what does the 'splitting criterion' primarily determine?

  • Which attribute to test at a node. (correct)

When dealing with a continuous-valued attribute A in decision tree induction, what conditions define the two possible outcomes at a node N?

  • A <= split_point or A > split_point. (correct)

In decision tree induction, if a discrete-valued attribute A is used to produce a binary tree, and the test at node N is of the form 'A ∈ SA?', what does SA represent?

  • The splitting subset for A returned by the attribute selection method. (correct)

What is the output of the 'Generate_decision_tree' algorithm?

  • A decision tree. (correct)

In the context of decision tree algorithms, what does the term 'majority voting' refer to?

  • Converting a node into a leaf and labeling it with the class that appears most frequently among the samples. (correct)

What is the fundamental purpose of Attribute Selection by Information Gain Computation in decision tree construction?

  • To select the attribute that best separates the samples into individual classes. (correct)

Why is 'Information Gain' sometimes biased in decision tree induction?

  • It favors tests with many outcomes. (correct)

How does 'Gain Ratio' attempt to improve upon 'Information Gain' in decision tree induction?

  • By applying a kind of normalization to information gain using a 'split information' value. (correct)

What is the result of using gain ratio on a dataset?

  • The attribute with the maximum gain ratio is selected as the splitting attribute. (correct)

In the context of extracting classification rules from decision trees, what does each 'path' from the root to a leaf represent?

  • A complete IF-THEN rule. (correct)

What is a primary advantage of decision tree induction in data mining?

  • It offers a relatively fast learning speed. (correct)

Which of the following is a necessary step with continuous-valued attributes within the algorithm for decision tree induction?

  • They must be discretized in advance. (correct)

What condition must exist regarding the test set in relation to the training set?

  • The test set must be independent of the training set. (correct)

Within the data transformation step of data preparation, what does 'data generalization' refer to?

  • Generalizing data to higher-level concepts within a concept hierarchy. (correct)

How does the algorithm handle a scenario where all remaining attributes have already been used for partitioning?

  • It employs majority voting to classify the node. (correct)

What is the significance of the splitting criterion when creating a decision tree?

  • It determines which attribute the decision tree will test at node N. (correct)

If a discrete-valued attribute A is used to produce a binary decision tree, how should the test at node N be formatted?

  • A ∈ SA? (correct)

When a test gives too many outcomes, the information gain is biased. How can this be accounted for?

  • Normalize the information gain via a 'split information' value. (correct)

When creating decision trees, why do we seek the best possible split at each node?

  • So that the resulting partitions are as 'pure' as possible. (correct)

What is a strength of decision tree induction?

  • Fast learning speed. (correct)

If a discrete-valued attribute A is used to create a binary decision tree where the test at node N is 'A ∈ SA?', which branch does a tuple that does not satisfy the test follow?

  • no (correct)

How can decision trees be made easier to understand?

  • By converting them to a set of human-understandable IF-THEN rules. (correct)

If all the data in partition D, represented by node N, belongs to the same class, what does this mean?

  • Recursive partitioning can stop. (correct)

In what form is knowledge represented when extracting classification rules from decision trees?

  • IF-THEN rules (correct)

Assume attribute A partitions the set S into subsets {S1, S2, ..., Sv}, one for each possible value of A. If subset Si contains pi examples of P and ni examples of N, what is the equation for the expected entropy E(A)?

  • $E(A) = \sum_{i=1}^{v} \frac{p_i + n_i}{p+n} I(p_i, n_i)$ (correct)

Assume a set contains $p$ elements of class $P$ and $n$ elements of class $N$. What function defines the entropy $I(p, n)$?

  • $I(p,n) = -\frac{p}{p+n} \log_2{\frac{p}{p+n}} - \frac{n}{p+n} \log_2{\frac{n}{p+n}}$ (correct)

Flashcards

Classification

Predicting categorical class labels, constructs a model based on training data and uses it to classify new data.

Numeric Prediction

Predicting unknown or missing continuous-valued functions or values.

Training set

The set of data tuples used for model construction in classification.

Supervised learning

Learning of the model with a given training set where training data is labeled.

Model usage

The model is used to classify future or unseen objects, and its accuracy is estimated.

Accuracy rate

Percentage of test set samples correctly classified by the model.

Supervision

The training data (observations, measurements) are accompanied by labels indicating the class of the observations.

Data cleaning

Preprocessing data to reduce noise and handle missing values.

Relevance analysis

Removing irrelevant or redundant attributes from the data.

Data transformation

Transforming data by generalizing it to higher-level concepts or normalizing it.

Decision tree

A flow-chart-like tree structure where each internal node denotes a test on an attribute.

Internal node

A node that denotes a test on an attribute in a decision tree.

Branch

Represents an outcome of a test in a decision tree.

Leaf nodes

Represent class labels or class distributions.

Decision tree induction

Automatically discovering a decision tree from data via tree construction and pruning.

Partition examples

Partitioning examples recursively based on selected attributes.

Decision Tree Algorithm

The basic algorithm is a top-down recursive divide-and-conquer approach.

A branch is created

Samples are partitioned according to the known values of the test attribute.

Statistical Measure

A statistical measure (e.g., information gain) is used to select the attribute that will best separate the samples into individual classes.

Recursive Partition Strategy

If the tuples in D are all of the same class, node N becomes a leaf and partitioning stops.

Tree Pruning

Removing branches that reflect noise or outliers to avoid overfitting.

Diversity Measurement

Measuring the degree of diversity with the entropy function.

Information Gain

The information gained by branching on an attribute.

Best Attribute Test

The attribute with the highest information gain is chosen as the best attribute.

Gain Ratio

Gain Ratio attempts to overcome the bias of information gain toward tests with many outcomes by normalizing information gain.

IF-THEN Rules

One rule is created for each path from the root to a leaf, and each attribute-value pair forms a conjunction.

Classification Advantages

Classification involves a relatively fast learning speed and comparable accuracy.

Study Notes

  • The lecture is about classification and decision trees.

Classification

  • It predicts categorical class labels.
  • It classifies data by constructing a model from a training set in which each tuple carries a class label in a classifying attribute, and then uses the model to classify new data.
  • Applications include categorizing bank loan applications as safe or risky.
  • Numeric prediction models continuous-valued functions, predicting unknown or missing values.
  • Applications include predicting expenditures of potential customers on computer equipment.
  • Typical applications of classification include credit approval, target marketing, medical diagnosis, and treatment effectiveness analysis.

Classification and Prediction

  • Step 1, model construction: Describes a predetermined set of data classes.
  • Each tuple or sample is assumed to belong to a predefined class that is determined by the class label attribute.
  • The training set describes the set of tuples used for model construction.
  • The training samples are individual tuples making up the training set.
  • Supervised learning is the learning of the model given a training set.
  • A learned model is represented as classification rules, decision trees, or mathematical formulae.
  • Step 2, model usage: The model classifies future or unseen objects.
  • The accuracy of the model is estimated.
  • The known label of a test sample is compared with the classified result from the model.
  • The accuracy rate shows the percentage of test set samples correctly classified by the model.
  • The test set is independent of the training set to avoid over-fitting.
  • If the accuracy is acceptable, the model classifies future data tuples with unknown class labels.
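
The accuracy-rate step above can be sketched in a few lines of Python. This is an illustration only: the stand-in model, the income threshold, and the test tuples are all made up, not part of the lesson.

```python
# Hypothetical sketch: estimating the accuracy rate of a model on an
# independent test set (step 2, model usage).
def accuracy_rate(model, test_set):
    """Percentage of test samples whose known label matches the model's prediction."""
    correct = sum(1 for features, label in test_set if model(features) == label)
    return 100.0 * correct / len(test_set)

# Toy classifier: a loan is "safe" when income >= 40 (made-up threshold).
model = lambda features: "safe" if features["income"] >= 40 else "risky"

test_set = [
    ({"income": 50}, "safe"),
    ({"income": 30}, "risky"),
    ({"income": 45}, "risky"),   # misclassified by the toy model
    ({"income": 20}, "risky"),
]
print(accuracy_rate(model, test_set))  # → 75.0
```

Because the test tuples are independent of whatever built the model, this estimate is less prone to the over-fitting optimism the notes warn about.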

Supervised learning

  • The training data has labels indicating the class of the observations.
  • New data is classified based on the training set.

Unsupervised learning

  • The class labels of the training data are unknown.
  • Given a set of measurements or observations it aims to establish the existence of classes or clusters in the data.

Issues Regarding Classification

  • Data preparation: Preprocesses data to reduce noise and handle missing values.
  • Relevance analysis: Removes irrelevant or redundant attributes.
  • Data transformation: Data is generalized to higher-level concepts and normalized when methods requiring distance measurements are used.
  • Evaluating classification methods takes into account predictive accuracy, speed, scalability, robustness, interpretability and goodness of rules.
  • Issues include time to construct and use the model, handling noise and missing values, efficiency in disk-resident databases and understanding provided by the model.

Decision Tree Induction

  • It has a flow-chart-like tree structure to perform classification.
  • An internal node represents a test on an attribute.
  • A branch represents an outcome of the test.
  • Leaf nodes represent class labels or class distribution.
  • It is used by classifying an unknown sample where the attribute values of the sample are tested against the decision tree.
  • A decision tree can be obtained through manual construction or through decision tree induction, which discovers a tree from data automatically.
  • Tree construction: At the start, all training examples are at the root and are partitioned recursively based on selected attributes.
  • Tree pruning identifies and removes branches that reflect noise or outliers.
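
Classifying an unknown sample by testing its attribute values against the tree can be sketched as follows. The nested-dict representation and the weather-style attributes are assumptions made for illustration, not something defined in the lesson.

```python
# Assumed representation: an internal node is a dict holding the attribute it
# tests and a {value: subtree} map of branches; a leaf is a class label string.
tree = {
    "attribute": "outlook",
    "branches": {
        "sunny": {"attribute": "humidity",
                  "branches": {"high": "no", "normal": "yes"}},
        "overcast": "yes",
        "rain": "no",
    },
}

def classify(node, sample):
    """Follow the branch matching the sample's value at each internal node."""
    while isinstance(node, dict):
        node = node["branches"][sample[node["attribute"]]]
    return node

print(classify(tree, {"outlook": "sunny", "humidity": "normal"}))  # → yes
```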

Algorithm for Decision Tree Induction

  • It uses a basic, greedy algorithm.
  • It creates a tree in a top-down recursive divide-and-conquer manner.
  • It starts with all training examples at the root.
  • All attributes are categorical, and if the values are continuous, they are discretized in advance.
  • Examples are partitioned recursively based on selected attributes.
  • The recursive partitioning stops when one of the following conditions is true.
  • All samples for a given node belong to the same class.
  • There are no remaining attributes on which the samples may be further partitioned, where majority voting is employed.
  • There are no samples for the branch test-attribute=ai, so a leaf is created with the majority class in samples.

Basic Algorithm for Decision Tree Induction

  • If the tuples in D are all of the same class, then node N becomes a leaf and is labeled with that class.
  • The algorithm calls Attribute_selection_method to determine the splitting criterion otherwise.
  • The splitting criterion indicates the splitting attribute, which is to be tested at node N.
  • The splitting criterion leads to specific branches from node N.
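
The algorithm described in the two sections above can be sketched as a short recursive function. This is a minimal ID3-style illustration under stated assumptions: categorical attributes only, samples as dicts, and the Attribute_selection_method abstracted into a `best_attribute` callable (all names here are illustrative). Branching only on values that actually occur sidesteps the empty-branch case; a fuller version would attach a majority-class leaf for unseen values.

```python
from collections import Counter

def generate_decision_tree(samples, labels, attributes, best_attribute):
    # Stopping condition 1: all samples belong to the same class -> leaf.
    if len(set(labels)) == 1:
        return labels[0]
    # Stopping condition 2: no remaining attributes -> leaf by majority voting.
    if not attributes:
        return Counter(labels).most_common(1)[0][0]
    # Splitting criterion: choose the attribute to test at this node.
    attr = best_attribute(samples, labels, attributes)
    remaining = [a for a in attributes if a != attr]  # attr is not reused below
    branches = {}
    for value in sorted({s[attr] for s in samples}):
        subset = [(s, l) for s, l in zip(samples, labels) if s[attr] == value]
        branches[value] = generate_decision_tree(
            [s for s, _ in subset], [l for _, l in subset],
            remaining, best_attribute)
    return {"attribute": attr, "branches": branches}

# Toy usage with one attribute and a trivial selection method.
samples = [{"outlook": "sunny"}, {"outlook": "sunny"}, {"outlook": "rain"}]
labels = ["no", "no", "yes"]
tree = generate_decision_tree(samples, labels, ["outlook"],
                              lambda smp, lbl, attrs: attrs[0])
print(tree)
```

In practice `best_attribute` would implement information gain or gain ratio rather than simply taking the first remaining attribute.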

Attribute Selection by Information Gain Computation

  • Information gain (used in ID3/C4.5) measures the degree of diversity using the entropy function.
  • To compute the gain of attribute A, the set S is partitioned into subsets according to the values of A, and the entropy before the split is compared with the expected entropy after it.
  • The attribute with the highest information gain is the test attribute for the set S.
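
The entropy and expected-entropy formulas from the quiz above can be worked through numerically. The counts below are illustrative; they happen to be the well-known 14-tuple weather example, where the gain works out to about 0.247 bits.

```python
# Sketch of the entropy function I(p, n) and expected entropy E(A).
from math import log2

def I(p, n):
    """Entropy of a set with p examples of class P and n examples of class N."""
    total = p + n
    return -sum((c / total) * log2(c / total) for c in (p, n) if c)

def expected_entropy(partitions):
    """E(A) = sum_i (p_i + n_i)/(p + n) * I(p_i, n_i)."""
    total = sum(p + n for p, n in partitions)
    return sum((p + n) / total * I(p, n) for p, n in partitions)

# Gain(A) = I(p, n) - E(A): a 9/5 class split partitioned into three subsets.
p, n = 9, 5
partitions = [(2, 3), (4, 0), (3, 2)]
gain = I(p, n) - expected_entropy(partitions)
print(round(gain, 3))  # → 0.247
```

Note the pure subset (4, 0) contributes zero entropy, which is exactly why purer partitions yield higher gain.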

Gain Ratio

  • Gain Ratio: Used in C4.5, the successor of ID3, it attempts to overcome the bias of information gain toward tests with many outcomes.
  • Instead of information gain, the attribute with the maximum gain ratio is selected as the splitting attribute.
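
The normalization can be sketched as follows, assuming the C4.5-style definition GainRatio(A) = Gain(A) / SplitInfo(A); the subset sizes and the gain value below are toy numbers chosen for illustration.

```python
# Sketch of gain ratio: information gain normalized by a 'split information'
# value that grows with the number (and evenness) of a test's outcomes.
from math import log2

def split_info(subset_sizes):
    """SplitInfo(A) = -sum_i |S_i|/|S| * log2(|S_i|/|S|)."""
    total = sum(subset_sizes)
    return -sum((s / total) * log2(s / total) for s in subset_sizes)

def gain_ratio(gain, subset_sizes):
    return gain / split_info(subset_sizes)

# A 3-way split of 14 tuples: the split information dampens the raw gain.
print(round(split_info([5, 4, 5]), 3))         # → 1.577
print(round(gain_ratio(0.247, [5, 4, 5]), 3))  # → 0.157
```

A test with very many small outcomes produces a large SplitInfo, so its gain ratio shrinks even if its raw information gain is high.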

Extracting Rules from Trees

  • Knowledge is represented in the form of IF-THEN rules.
  • One rule is created for each path from the root to a leaf.
  • Each attribute-value pair along a path forms a conjunction.
  • The leaf node holds the class prediction.
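
The extraction described above can be sketched as a recursive walk over the tree. The nested-dict tree shape here is an assumed representation (internal nodes as `{'attribute': ..., 'branches': {...}}` dicts, leaves as class labels), and the tree contents are illustrative.

```python
# Hypothetical sketch: one IF-THEN rule per root-to-leaf path, with the
# attribute-value pairs along the path conjoined and the leaf as the THEN part.
def extract_rules(node, conditions=()):
    if not isinstance(node, dict):  # leaf: the accumulated path becomes a rule
        cond = " AND ".join(f"{a} = {v}" for a, v in conditions) or "TRUE"
        return [f"IF {cond} THEN class = {node}"]
    rules = []
    for value, subtree in node["branches"].items():
        rules += extract_rules(subtree,
                               conditions + ((node["attribute"], value),))
    return rules

tree = {"attribute": "outlook",
        "branches": {"overcast": "yes",
                     "rain": {"attribute": "windy",
                              "branches": {"true": "no", "false": "yes"}}}}
for rule in extract_rules(tree):
    print(rule)
# → IF outlook = overcast THEN class = yes
# → IF outlook = rain AND windy = true THEN class = no
# → IF outlook = rain AND windy = false THEN class = yes
```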

Classification in Large Databases

  • Scalability: Classifying data sets with millions of examples and hundreds of attributes with reasonable speed.
  • Decision tree induction is used, because it has a relatively fast learning speed, it is convertible to simple classification rules and it has a comparable classification accuracy to other methods.
