Questions and Answers
Which of the following is a core task of text classification?
- Generating document translations.
- Summarizing document content.
- Sentiment analysis. (correct)
- Extracting key entities from text.
In text classification, what constitutes the 'input' to a classification model?
- A set of predefined rules.
- A predicted class.
- A labeled dataset.
- A document and a fixed set of classes. (correct)
What is the primary drawback of using hand-coded rules for text classification?
- They are difficult to interpret.
- They often have low accuracy.
- They are computationally expensive to run.
- They are costly to build and maintain. (correct)
Which of the following is an output of a supervised machine learning approach to text classification?
Which of the following machine learning algorithms is commonly used for text classification?
What is the core principle behind the Naïve Bayes approach to text classification?
In the context of Naïve Bayes, what is a 'bag of words' representation?
Given a document d and a class c, according to Bayes' Rule, what is proportional to the probability $P(c \mid d)$?
In the Naïve Bayes classifier, what is the significance of the 'naïve' assumption?
In Multinomial Naïve Bayes, what does conditional independence imply?
What is a practical reason for using logarithms when calculating probabilities in Naïve Bayes?
How does taking the logarithm of probabilities affect the ranking of classes in Naïve Bayes?
In the context of Naïve Bayes, what is Maximum Likelihood Estimation (MLE) used for?
In Naïve Bayes, what problem does Laplace (add-1) smoothing address?
What is the typical approach in Naïve Bayes for handling words that appear in the test data but not in the training data?
What is the primary purpose of removing stop words in text classification?
What is the purpose of using binary multinomial Naive Bayes?
In Binary Multinomial Naive Bayes, how are duplicate words treated in a test document?
Why is accuracy not always the best metric for evaluating text classification models?
In the context of evaluating a text classification system, what does 'precision' measure?
What does 'recall' measure in text classification evaluation?
Why are both precision and recall important in evaluating text classification systems?
What is the F-measure?
Why is the harmonic mean often used when calculating the F-measure?
In a multiclass classification problem, what is macroaveraging used for?
What is microaveraging in the context of evaluating a multiclass classification?
What is the purpose of using a development test set?
What are development test sets and cross-validation primarily used for?
What is k-fold cross-validation primarily used for?
In k-fold cross-validation, how is the data typically divided?
What is the purpose of performing multiple runs and reporting average performance in k-fold cross-validation?
Which of the following evaluation metrics is most sensitive to class imbalance?
A text classification model is designed to detect positive tweets about a company. It identifies 80 out of 100 real positive tweets, but also flags 20 negative tweets as positive. What are its recall and precision scores respectively?
A text classification model detects 50 spam emails, with 40 of them actually being spam. What is the precision of the model?
A text classification model correctly identifies 70 relevant documents out of a total of 100 relevant documents. What is the recall score?
In a binary text classification task, a model has a precision of 0.6 and a recall of 0.8. What is the F1 score?
Consider that Classifier A has P: 0.53, R: 0.36 and Classifier B has P: 0.01, R: 0.99. Which classifier has the better F-measure?
In a 3-class classification problem (urgent, normal, spam), which averaging method would be most appropriate if you want to ensure that each class contributes equally to the overall performance measure, regardless of its size?
In k-fold cross validation, if the value of k is increased, what potential impact will that have on the bias and variance of your estimate?
Flashcards
Text Classification
The task of assigning a document to one or more predefined categories based on its content.
Sentiment Analysis
Identifying the sentiment (positive, negative, or neutral) expressed in a piece of text.
Spam Detection
Identifying if an email or message is unwanted or malicious.
Language Identification
Determining which language a piece of text is written in.
Classification Methods: Hand-coded rules
Classifying text with manually written rules over words and other features; accuracy can be high, but the rules are costly to build and maintain.
Classification Methods: Supervised Machine Learning
Learning a classifier y: d → c from a training set of hand-labeled documents.
Naïve Bayes Classifier
A probabilistic classifier based on Bayes' rule that returns the class with the maximum posterior probability given the document.
Bag of Words
A document representation that records how many times each word occurs, ignoring word order.
Posterior Probability
P(c|d): the probability of a class given the document.
Maximum a Posteriori (MAP)
Choosing the class ĉ = argmax[c∈C] P(c|d), the most probable class given the document.
Likelihood
P(d|c): the probability of the document given the class.
Prior
P(c): the probability of a class before observing the document.
Conditional Independence
The assumption that the feature probabilities P(xi|c) are independent given the class c.
Multinomial Naïve Bayes
A Naïve Bayes variant over word counts that combines the bag-of-words and conditional independence assumptions.
Laplace Smoothing
Add-1 smoothing of word counts so that unseen term-class combinations do not receive zero probability.
Stop Words
Very frequent words such as "the" and "a" that some systems remove from training and test sets.
Binary Multinomial Naive Bayes
A Naïve Bayes variant for sentiment tasks that clips word counts at 1, so only word occurrence matters.
Confusion Matrix
A table of true/false positives and negatives from which precision, recall, and accuracy are computed.
Precision
The fraction of items the system labels positive that are actually positive: TP / (TP + FP).
Recall
The fraction of actually positive items that the system correctly identifies: TP / (TP + FN).
F-measure
A single number combining precision and recall; the balanced F1 is 2PR / (P + R).
Macroaveraging
Computing the metric for each class separately and averaging the results, so each class counts equally.
Microaveraging
Pooling all classes' decisions into one confusion matrix and computing the metric from it, so large classes dominate.
Cross-validation
Evaluating over multiple train/test splits (e.g., k folds) and reporting average performance.
Study Notes
- Naïve Bayes is used for text classification.
- The lecture covers the task of text classification, the Naïve Bayes text classifier, Naïve Bayes learning, sentiment and binary Naïve Bayes, and accuracy, precision, recall, and F-measure.
The Task of Text Classification
- Text classification involves assigning predefined categories to text documents.
- Examples of text classification tasks include sentiment analysis, spam detection, language identification, and assigning categories to news articles.
Text Classification Definition
- The input is a document d and a fixed set of classes C = {c1, c2, ..., cJ}.
- The output is a predicted class c that belongs to C.
Classification Methods: Hand-coded Rules
- Classification can be done using hand-coded rules based on the combination of words and other features.
- For example, a spam filter might use the rule: "black-list-address OR ('dollars' AND 'have been selected')" (a sketch of such a rule appears after this list).
- Accuracy using hand-coded rules can be high if rules are carefully refined by an expert.
- Building and maintaining these rules can be expensive.
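A minimal sketch of what such a hand-coded rule might look like in Python; the blacklist address and the trigger phrases are hypothetical illustrations, not a real filter:

```python
# Hypothetical blacklisted senders (illustration only).
BLACKLIST = {"winner@lottery.example"}

def is_spam(sender: str, body: str) -> bool:
    # Rule: black-list-address OR ("dollars" AND "have been selected")
    text = body.lower()
    return sender in BLACKLIST or ("dollars" in text and "have been selected" in text)

print(is_spam("friend@mail.example", "You have been selected to win dollars!"))  # True
```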
Classification Methods: Supervised Machine Learning
- The input consists of a document d, a fixed set of classes C = {c1, c2, ..., cJ}, and a training set of m hand-labeled documents D = {(d1, c1), ..., (dm, cm)}.
- The output is a learned classifier y: d → c.
- Classifiers include Naïve Bayes, logistic regression, neural networks, and k-nearest neighbors.
Naïve Bayes Intuition
- Naïve Bayes is a classification method based on Bayes' rule.
- The representation of a document is very simple: a bag of words.
Bag of Words Representation
- A document's bag-of-words representation tallies how many times each word occurs.
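As a quick illustration, a bag-of-words tally can be sketched with Python's `collections.Counter` (the sample sentence is invented):

```python
from collections import Counter

doc = "it was a great movie a great great story"
bag = Counter(doc.split())  # word order is discarded; only counts survive
print(bag)  # Counter({'great': 3, 'a': 2, 'it': 1, 'was': 1, 'movie': 1, 'story': 1})
```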
Bayes’ Rule Applied to Documents and Classes
- For a document d and a class c, Bayes' Rule is P(c|d) = [P(d|c)P(c)] / P(d).
Naïve Bayes Classifier
- It returns the class ĉ with the maximum posterior probability (MAP) given the document.
- The formula is ĉ = argmax[c∈C] P(c|d).
- This formula can be rewritten using Bayes' Rule as argmax[c∈C] [P(d|c)P(c)] / P(d).
- Because P(d) is the same for all classes, it can be dropped.
- A document d is represented as a set of features (x1, ..., xn).
Multinomial Naïve Bayes Independence Assumptions
- The Bag of Words assumption posits that the location of words in a document doesn't matter.
- Conditional Independence assumes all the feature probabilities P(xi|c) are independent given the class c.
- P(x1, x2, ..., xn|c) = P(x1|c)P(x2|c) ... P(xn|c).
Multinomial Naïve Bayes Classifier
- The formula for the Multinomial Naïve Bayes Classifier is: cMAP = argmax[c∈C] P(x1, x2, ..., xn|c)P(c)
- The Naïve Bayes version of this formula is: cNB = argmax[c∈C] P(c) ∏[i=1 to n] P(xi|c)
Applying Naïve Bayes Classifiers to Text Classification
- 'Positions' ranges over all word positions in the text document being classified.
- The formula is cNB = argmax[c∈C] P(c) ∏[i∈positions] P(wi|c)
Problems with Multiplying Lots of Probs
- Multiplying many probabilities can result in floating-point underflow.
- To avoid issues with floating point underflow, logs can be used because log(ab) = log(a) + log(b)
- Instead of multiplying probabilities, the sum of logs of probabilities is used.
Calculating in Log Space
- Instead of cNB = argmax[c∈C] P(c) ∏[i∈positions] P(wi|c), the calculation becomes cNB = argmax[c∈C] [log P(c) + ∑[i∈positions] log P(wi|c)].
- Taking the log does not change the ranking of classes: the class with the highest probability also has the highest log probability.
- Naïve Bayes is therefore a linear classifier: it takes the max of a sum of weights, which is a linear function of the inputs.
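A minimal sketch of prediction in log space, assuming `log_prior` and `log_likelihood` dictionaries estimated as in the learning sections below (the names and data layout are illustrative); unknown words are skipped, as discussed later:

```python
import math

def classify(doc_words, classes, log_prior, log_likelihood, vocabulary):
    # c_NB = argmax over c of [ log P(c) + sum_i log P(w_i | c) ]
    best_class, best_score = None, -math.inf
    for c in classes:
        score = log_prior[c]
        for w in doc_words:
            if w in vocabulary:  # words outside the vocabulary are ignored
                score += log_likelihood[c][w]
        if score > best_score:
            best_class, best_score = c, score
    return best_class
```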
Learning the Multinomial Naive Bayes Model
- Maximum likelihood estimation (MLE) is used.
- P(c) = Nc / N, where Nc is the number of documents in class c, and N the total.
- P(wi|c) = count(wi, c) / ∑w∈V count(w, c), where count(w,c) is the number of times that word w occurs in documents of class c in the training data
Parameter Estimation
- P(wi|c) = count(wi, c) / ∑w∈V count(w, c), i.e., the fraction of times word wi appears among all words in documents of topic c.
- To compute these counts, concatenate all the documents for topic j into a single "mega-document".
- The frequency of word w in that mega-document then gives count(w, c).
Problem with Maximum Likelihood
- MLE returns a zero probability for any term-class combination that does not occur in the training data.
- For example, if no positive training document contains the word "fantastic", then P("fantastic"|positive) = count("fantastic", positive) / ∑w∈V count(w, positive) = 0.
Laplace (add-1) Smoothing for Naïve Bayes
- P(wi|c) = (count(wi, c) + 1) / ∑w∈V (count(w, c) + 1)
- This can also be written as (count(wi, c) + 1) / ((∑w∈V count(w, c)) + |V|).
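As a small sketch of the add-1 estimate, assuming `count[c]` holds a word-count dictionary for class c (an invented layout):

```python
def laplace_prob(w, c, count, vocabulary):
    # P(w|c) = (count(w, c) + 1) / (sum over w' of count(w', c) + |V|)
    total = sum(count[c].values())
    return (count[c].get(w, 0) + 1) / (total + len(vocabulary))
```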
Multinomial Naïve Bayes: Learning
- Extract the vocabulary from the training corpus.
- Calculate the P(cj) terms: for each cj in C, docsj ← all docs with class = cj, and P(cj) ← |docsj| / (total number of documents).
- Calculate the P(wk | cj) terms: Textj ← a single doc containing all docsj; for each word wk in Vocabulary, nk ← the number of occurrences of wk in Textj.
- P(wk | cj) ← (nk + α) / (n + α|Vocabulary|), where n is the total number of word tokens in Textj.
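These steps translate into a short training function; a sketch assuming the training set is a list of (word list, class label) pairs, with α = 1 giving Laplace smoothing:

```python
import math
from collections import Counter

def train_multinomial_nb(docs, alpha=1.0):
    """docs: list of (list_of_words, class_label) pairs."""
    vocabulary = {w for words, _ in docs for w in words}
    classes = {c for _, c in docs}
    log_prior, log_likelihood = {}, {}
    for c in classes:
        class_docs = [words for words, label in docs if label == c]
        log_prior[c] = math.log(len(class_docs) / len(docs))        # P(c) = Nc / N
        text_c = Counter(w for words in class_docs for w in words)  # the "mega-document"
        n = sum(text_c.values())                                    # word tokens in Text_j
        log_likelihood[c] = {
            w: math.log((text_c[w] + alpha) / (n + alpha * len(vocabulary)))
            for w in vocabulary
        }
    return classes, log_prior, log_likelihood, vocabulary
```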
Unknown Words
- Words that appear in test data but not in the training data or vocabulary can be ignored.
- Remove unknown words from the test document.
- There is no probability included for them.
- An unknown word model isn't built because knowing which class has more unknown words is not generally helpful.
Stop Words
- Some systems ignore stop words.
- To build a stop word list, sort the vocabulary by word frequency in the training set.
- Words like "the" and "a" can be very frequent.
- The top 10 or 50 words or so form the stop word list.
- All stop words are removed from both training and test sets, as if they were never there.
- Most NB algorithms use all words and don't use stop word lists.
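For systems that do want a stop word list, the frequency-based construction above might be sketched like this:

```python
from collections import Counter

def build_stop_word_list(training_docs, top_k=50):
    # training_docs: list of word lists; take the top_k most frequent words
    freq = Counter(w for doc in training_docs for w in doc)
    return {w for w, _ in freq.most_common(top_k)}

def remove_stop_words(doc, stop_words):
    return [w for w in doc if w not in stop_words]
```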
Sentiment and Binary Naïve Bayes
- Binary NB is a variant of multinomial Naïve Bayes optimized for sentiment analysis.
- Word occurrence seems more important than word frequency.
- The occurrence of the word "fantastic" tells us a lot; the fact that it occurs 5 times may not tell us much more.
- Binary multinomial Naïve Bayes (binary NB) therefore clips word counts at 1.
Binary Multinomial Naive Bayes on a Test Document d
- Remove all duplicate words from d
- Compute NB using the same equation: cNB = argmax P(cj) ∏ P(wi | cj), where the argmax is taken over all cj ∈ C, and the product is taken over all i ∈ positions
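Deduplication is a one-line preprocessing step; a sketch reusing the `classify` function from the log-space section (training-time counts would be clipped at 1 in the same way, not shown):

```python
def binary_nb_classify(doc_words, classes, log_prior, log_likelihood, vocabulary):
    # Binary NB: keep each word type at most once in the test document
    return classify(set(doc_words), classes, log_prior, log_likelihood, vocabulary)
```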
Evaluation
- Consider binary text classification tasks to evaluate performance.
- Positive class: tweets about Delicious Pie Co
- Negative class: all other tweets
The 2-by-2 Confusion Matrix
- The confusion matrix feeds into the metrics precision, recall, and accuracy.
Evaluation: Accuracy
- Accuracy is not a preferred metric here: if 100 of 1 million tweets are about Delicious Pie Co and 999,900 are about something else, a dumb classifier can simply label every tweet "not about pie."
- The "not about pie" classifier will have a 99.99% accuracy, but is useless because it doesn't return the comments the user is looking for.
- Precision and recall are more useful than accuracy.
Evaluation: Precision
- Precision measures how many of the items the system detects are actually positive.
- Precision = true positives / (true positives + false positives)
Evaluation: Recall
- Recall is the % of items actually present in the input that were correctly identified by the system.
- Recall = true positives / (true positives + false negatives)
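Both metrics follow directly from the confusion-matrix counts; the numbers below reproduce the pie-tweet example from the questions section:

```python
def precision(tp, fp):
    return tp / (tp + fp)  # fraction of detected items that are truly positive

def recall(tp, fn):
    return tp / (tp + fn)  # fraction of true positives that were found

# 80 of 100 real positive tweets found (20 missed), plus 20 false alarms:
print(recall(tp=80, fn=20))     # 0.8
print(precision(tp=80, fp=20))  # 0.8
```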
Why Precision and Recall
- A dumb pie-classifier just labels nothing as about pie.
- Accuracy is 99.99%
- Recall is 0
- Precision and recall, unlike accuracy, emphasize true positives.
A Combined Measure: F
- The F measure is a single number that combines precision (P) and recall (R).
- Fβ = [(β²+1)PR] / [β²P+R]
- The balanced measure is F1 where F1 = [2PR] / [P+R]
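A sketch of both forms; for instance, P = 0.6 and R = 0.8 (another example from the questions) give F1 ≈ 0.686:

```python
def f_measure(p, r, beta=1.0):
    # F_beta = (beta^2 + 1) * P * R / (beta^2 * P + R); beta = 1 gives F1
    return (beta**2 + 1) * p * r / (beta**2 * p + r)

print(f_measure(0.6, 0.8))  # 0.6857... = 2PR / (P + R)
```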
Development Test Sets and Cross-validation
- Datasets are partitioned into training sets, development test sets and test sets.
- Common metrics include precision (P), recall (R), F1 score, and Accuracy.
- Unseen test sets avoid overfitting ("tuning to the test set") and provide a more conservative estimate of performance.
- Cross-validation evaluates over multiple splits, e.g., k-fold cross-validation or repeated train/test splits.
K-fold Cross-validation
- With k = 10, the data is split into 10 folds.
- Each fold keeps an equal balance of positive and negative examples.
- Choose each fold in turn as a temporary test set.
- Train on the other 9 folds and compute performance on the test fold.
- Report the average performance over the 10 runs.
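A minimal sketch of the loop, reusing the `train_multinomial_nb` and `classify` sketches above; it shuffles rather than stratifies and reports mean accuracy (assumed choices):

```python
import random

def k_fold_accuracy(data, k=10, seed=0):
    """data: list of (word_list, label) pairs; returns mean accuracy over k runs."""
    data = data[:]
    random.Random(seed).shuffle(data)   # stratified folds would balance classes better
    folds = [data[i::k] for i in range(k)]
    scores = []
    for i in range(k):
        test = folds[i]                 # each fold serves once as the temporary test set
        train = [ex for j, fold in enumerate(folds) if j != i for ex in fold]
        classes, lp, ll, vocab = train_multinomial_nb(train)
        correct = sum(classify(words, classes, lp, ll, vocab) == label
                      for words, label in test)
        scores.append(correct / len(test))
    return sum(scores) / k              # report the average of the k runs
```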