Questions
What is classification accuracy primarily defined as?
Which metric is specifically a tabular representation of prediction outcomes for a binary classifier?
When is classification accuracy considered misleading?
In the confusion matrix, what do the rows generally represent?
What is a limitation of using classification accuracy as a metric?
What determines the splitting criterion when inducing a decision tree?
What occurs if all tuples in a dataset belong to the same class during decision tree induction?
In the context of decision tree splitting, what happens when an attribute A has distinct values?
What indicates that node N is ready to split during the decision tree process?
Study Notes
Classifier Performance Evaluation Metrics
- Classifiers predict class labels (e.g., Yes/No, Spam/Not Spam) based on training data.
- Evaluation metrics are crucial for assessing model accuracy and effectiveness.
Key Evaluation Metrics
- Accuracy: Ratio of correct predictions to total samples; misleading if class sizes are imbalanced.
- Confusion Matrix: A table showing true positive, true negative, false positive, and false negative predictions to assess model performance.
- Precision: Proportion of true positives among all positive predictions; indicates how trustworthy a positive prediction is.
- Recall: Proportion of actual positives that are correctly identified; measures the model's sensitivity (both are computed in the sketch below).
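A minimal sketch of how these metrics can be computed with scikit-learn (assumed available); the label vectors below are made up purely for illustration:

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]  # actual labels (1 = Yes / Spam)
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]  # hypothetical classifier output

print("Accuracy: ", accuracy_score(y_true, y_pred))   # correct / total   = 0.75
print("Precision:", precision_score(y_true, y_pred))  # TP / (TP + FP)    = 0.75
print("Recall:   ", recall_score(y_true, y_pred))     # TP / (TP + FN)    = 0.75
```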
Classification Accuracy
- Provides a quick overview of model performance.
- Works effectively when sample sizes are balanced across classes.
- High accuracy can be misleading, particularly on imbalanced datasets: if 99% of samples belong to one class, a model that always predicts that class scores 99% accuracy while never identifying the minority class.
Confusion Matrix
- Represents binary classification outcomes in a tabular format.
- Columns represent predicted values; rows represent actual values.
- Can be extended to multiclass classification (a binary sketch follows below).
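Continuing the toy example from above, a sketch of the matrix itself; note that scikit-learn's `confusion_matrix` puts actual classes on rows and predicted classes on columns, in sorted label order:

```python
from sklearn.metrics import confusion_matrix

y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

cm = confusion_matrix(y_true, y_pred)
print(cm)
# [[3 1]    row 0: actual negatives -> [TN, FP]
#  [1 3]]   row 1: actual positives -> [FN, TP]
```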
Decision Tree Learning
- Decision trees split data subsets based on attribute values to create branches.
- Splits aim for pure partitions where all tuples in a child node belong to the same class.
- Key impurity measures used as splitting criteria include (see the sketch after this list):
- Gini Index: Measures impurity in datasets.
- Entropy: Assesses randomness or impurity in data.
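A minimal, dependency-free sketch of both measures; the class counts (9 'Yes' vs. 5 'No') are a hypothetical partition:

```python
import math

def gini(counts):
    """Gini impurity: 1 - sum(p_i^2); 0 means a pure node."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

def entropy(counts):
    """Shannon entropy in bits: -sum(p_i * log2(p_i)); 0 means pure."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

print(gini([9, 5]))     # ~0.459
print(entropy([9, 5]))  # ~0.940 bits
```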
Information Gain
- Identifies which attribute most reduces entropy when the data is split on it.
- Calculated as the difference between original information requirement and the new requirement after partitioning.
- High information gain indicates a strong candidate for root node splitting.
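A sketch of that calculation, reusing the entropy definition from the previous sketch; the three-way split (class counts per child node) is hypothetical:

```python
import math

def entropy(counts):
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def information_gain(parent_counts, children):
    # Gain = entropy(parent) - weighted average entropy of the children.
    total = sum(parent_counts)
    weighted = sum((sum(c) / total) * entropy(c) for c in children)
    return entropy(parent_counts) - weighted

# Parent node: 9 Yes / 5 No; a hypothetical attribute splits it three ways.
print(information_gain([9, 5], [[2, 3], [4, 0], [3, 2]]))  # ~0.246
```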
Pruning Techniques
- Pre-Pruning:
  - Limits model complexity while the tree is being grown, stopping splits early.
  - Techniques include setting a maximum depth and a minimum number of samples per leaf.
- Post-Pruning:
  - Simplifies the tree after it has been fully grown, to improve generalization.
  - Involves techniques such as Cost-Complexity Pruning and Reduced Error Pruning.
- Both styles are sketched in code below.
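A sketch of how both styles map onto scikit-learn's `DecisionTreeClassifier`; the parameter values are placeholders, not recommendations:

```python
from sklearn.tree import DecisionTreeClassifier

# Pre-pruning: cap complexity while the tree is grown.
pre_pruned = DecisionTreeClassifier(max_depth=4, min_samples_leaf=10)

# Post-pruning: grow the tree, then prune weak branches via
# cost-complexity pruning; a larger ccp_alpha prunes more aggressively
# and is typically tuned with cross-validation.
post_pruned = DecisionTreeClassifier(ccp_alpha=0.01)
```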
Example Scenario: Loan Approval Prediction
- Features: Income, Credit Score, Loan Amount, Loan Purpose.
- Target variable: Repayment Status (Yes/No); a toy version of this setup is sketched below.
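Purely illustrative sketch: the rows and labels below are fabricated to match the scenario (they are not real loan data), and Loan Purpose is given a hypothetical integer encoding:

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Columns: Income, CreditScore, LoanAmount, LoanPurpose (0=car, 1=home, 2=education)
X = [
    [45000, 620, 10000, 0],
    [82000, 710, 25000, 1],
    [30000, 580,  8000, 0],
    [61000, 690, 15000, 2],
    [98000, 760, 40000, 1],
    [27000, 550,  5000, 0],
]
y = [1, 1, 0, 1, 1, 0]  # Repayment Status: 1 = Yes, 0 = No

tree = DecisionTreeClassifier(max_depth=2).fit(X, y)
print(export_text(tree, feature_names=["Income", "CreditScore", "LoanAmount", "LoanPurpose"]))
```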
Model Evaluation Issues
- Overfitting: Model captures noise in training data, failing to generalize well.
- Underfitting: Model fails to capture the underlying trends in the data (both failure modes are demonstrated below).
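A quick way to see both failure modes is to vary tree depth on a stand-in dataset (iris here, purely for illustration): a depth-1 tree underfits, while a very deep tree fits the training set almost perfectly yet gains little or nothing on held-out data:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

for depth in (1, 3, 10):
    m = DecisionTreeClassifier(max_depth=depth, random_state=0).fit(X_tr, y_tr)
    print(f"depth={depth}: train={m.score(X_tr, y_tr):.2f}, test={m.score(X_te, y_te):.2f}")
```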
Cross-Validation Techniques
- Holdout Method: Divides data into training, validation, and test sets for performance evaluation.
- K-Fold Cross-Validation: Splits data into k subsets; trains and validates k times, each time using a different fold for validation (both methods are sketched below).
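A sketch of both schemes with scikit-learn, again using iris as a stand-in dataset:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
model = DecisionTreeClassifier(max_depth=3)

# Holdout: a single train/test split (a validation set can be carved
# out of the training portion the same way).
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
print("Holdout accuracy:", model.fit(X_tr, y_tr).score(X_te, y_te))

# 5-fold CV: five train/validate rounds, each fold validating once.
scores = cross_val_score(model, X, y, cv=5)
print("5-fold mean accuracy:", scores.mean())
```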
Evaluation Metrics for Different Problems
- Classification Metrics: Accuracy, Precision, Recall, F1 Score, ROC-AUC.
- Regression Metrics: Mean Absolute Error (MAE), Mean Squared Error (MSE), Root Mean Squared Error (RMSE), R-Squared (the regression metrics are sketched below).
- Other Metrics: Logarithmic Loss; the Confusion Matrix also captures prediction performance.
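A sketch of the regression metrics; the target and prediction values are made up:

```python
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

y_true = [3.0, 5.0, 2.5, 7.0]
y_pred = [2.5, 5.0, 3.0, 8.0]

mse = mean_squared_error(y_true, y_pred)
print("MAE: ", mean_absolute_error(y_true, y_pred))  # mean absolute error
print("MSE: ", mse)
print("RMSE:", mse ** 0.5)                # same units as the target
print("R^2: ", r2_score(y_true, y_pred))  # 1.0 = perfect fit
```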
Model Selection Techniques
- Cross-Validation: Ensures reliable performance estimates; helps avoid over/underfitting.
- Grid Search: Exhaustive parameter search, usually combined with cross-validation.
- Random Search: Randomly samples parameter combinations, offering efficiency.
- Bayesian Optimization: Builds a probabilistic model of the objective to explore the parameter space efficiently (grid and random search are sketched below).
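Grid and random search are both in scikit-learn; a minimal sketch with illustrative (not recommended) parameter ranges follows. Bayesian optimization is not in scikit-learn itself and typically comes from a separate library such as Optuna or scikit-optimize.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
params = {"max_depth": [2, 3, 4, 5], "min_samples_leaf": [1, 5, 10]}

# Grid search: scores every parameter combination with 5-fold CV.
grid = GridSearchCV(DecisionTreeClassifier(), params, cv=5).fit(X, y)
print(grid.best_params_, grid.best_score_)

# Random search: samples a fixed number of combinations instead.
rand = RandomizedSearchCV(DecisionTreeClassifier(), params, n_iter=5, cv=5,
                          random_state=0).fit(X, y)
print(rand.best_params_, rand.best_score_)
```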
Description
This quiz will help you understand various metrics used to evaluate the performance of classification models. You'll learn about classification accuracy, precision, recall, and F1-score among other important measures. Ensure your model’s effectiveness by mastering these evaluation techniques.