Classification - ICYBM101 Machine Learning and Data Mining (PDF)

Document Details

Uploaded by IntricateNickel6136

Université de Namur

2024

Aurélien Géron

Tags

machine learning, classification, data mining, machine learning concepts

Summary

This document is a set of lecture notes for an undergraduate Machine Learning and Data Mining course. It covers fundamental concepts of classification, such as precision, recall, confusion matrices, and the MNIST dataset, and illustrates with examples how these concepts are applied in practice.

Full Transcript

Classification - Chapter 3
ICYBM101 Machine Learning and Data Mining
30 September 2024
Prof. dr. Katrien Beuls
Faculté d'informatique, Université de Namur
https://unamur.be/info
www.unamur.be

MNIST Dataset
» Set of 70,000 small images of digits handwritten by high school students and employees of the US Census Bureau
» Each image has 784 features (28 × 28 pixels)
» Each feature represents one pixel's intensity, from 0 (white) to 255 (black)

Inspecting the data...

Performance measures: Measuring accuracy using cross-validation
» k-fold cross-validation with three folds
» Split the training set into 3 folds, then train the model 3 times, holding out a different fold each time for evaluation
» Above 95% accuracy on all cross-validation folds! But how does this compare to a baseline?

Performance measures: Confusion matrix
» General idea: count the number of times instances of class A are classified as class B, for all A/B pairs.

                            predicted non-5s        predicted 5s
  negative class (non-5s)   53,892 (true negatives)    687 (false positives)
  positive class (5s)        1,891 (false negatives)  3,530 (true positives)

Performance measures: Precision and recall
» Precision = accuracy of the positive predictions
  precision = TP / (TP + FP) = 3530 / (3530 + 687) ≈ 0.84
» Recall = true positive rate
  recall = TP / (TP + FN) = 3530 / (3530 + 1891) ≈ 0.65

Illustration
[Figure slide illustrating precision and recall]

F1 score
» Harmonic mean of precision and recall:
  F1 = 2 / (1/precision + 1/recall) = 2 × (precision × recall) / (precision + recall) = TP / (TP + (FN + FP)/2)
  F1 = 3530 / (3530 + (1891 + 687)/2) ≈ 0.73

The precision/recall trade-off
» How did the classifier make its decision?
[Two further figure slides illustrating the trade-off]

The receiver operating characteristic (ROC) curve
» Very similar to the precision/recall curve
» Plots instead the true positive rate (i.e. recall) against the false positive rate (FPR)
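The precision, recall, F1, and ROC-related numbers on the preceding slides can be reproduced from the confusion-matrix counts of the 5-detector (TN = 53,892, FP = 687, FN = 1,891, TP = 3,530). A plain-Python sketch (the variable names are mine, not from the course code):

```python
# Metrics derived from the 5-detector confusion matrix shown on the slides.
TN, FP, FN, TP = 53892, 687, 1891, 3530

precision = TP / (TP + FP)             # accuracy of the positive predictions
recall    = TP / (TP + FN)             # true positive rate (TPR)
f1        = TP / (TP + (FN + FP) / 2)  # harmonic mean of precision and recall
fpr       = FP / (FP + TN)             # false positive rate, x-axis of the ROC curve

print(round(precision, 2))  # 0.84
print(round(recall, 2))     # 0.65
print(round(f1, 2))         # 0.73
print(round(fpr, 4))        # 0.0126
```

In practice these values come from `sklearn.metrics` (`precision_score`, `recall_score`, `f1_score`), but the raw-count formulas above are exactly what those functions compute for a binary task.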
» FPR (also called fall-out) is the ratio of negative instances that are incorrectly classified as positive
» FPR = 1 − TNR (the true negative rate, also called specificity)

The receiver operating characteristic (ROC) curve
[Figure: a ROC curve plotting the false positive rate against the true positive rate for all possible thresholds; the black circle highlights the chosen ratio (at 90% precision and 48% recall)]

Comparing PR curves
[Figure 3-8. Comparing PR curves: the random forest classifier is superior to the SGD classifier because its PR curve is much closer to the top-right corner, and it has a greater AUC]

Performing multiclass classification with multiple binary classifiers
» One-versus-the-rest (OvR) strategy
» One-versus-one (OvO) strategy
  » N × (N − 1) / 2 classifiers needed
  » MNIST: 45 binary classifiers (!)
  » Each classifier only needs to be trained on the part of the training set containing the two classes it must distinguish

Error analysis
» Coloured confusion matrix
» Confusion matrix with errors only

Error analysis: Gaining insights into ways to improve
» Effort should be spent on reducing the false 8s!
» Try to gather more training data for digits that look like 8s (but are not)
» Engineer new features that would help the classifier (e.g. count the number of closed loops)
» Preprocess the images to make some patterns stand out more

Error analysis: Analysing individual errors
» Data augmentation

Multilabel classification
» E.g. a face-recognition classifier: attach one tag per person it recognises in a picture
» Simpler example: two target labels for each digit image: large (7, 8 or 9)? and odd?
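The "large and odd" example just described can be sketched in plain Python; the function name is hypothetical, chosen only to show how each digit maps to its pair of boolean target labels:

```python
def multilabel_targets(digit):
    """Return the two boolean target labels used on the slides:
    is the digit large (7, 8 or 9)? and is it odd?"""
    return [digit >= 7, digit % 2 == 1]

print(multilabel_targets(5))  # [False, True]
print(multilabel_targets(8))  # [True, False]
```

A multilabel classifier is then trained on such label pairs instead of a single class per instance.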
» A k-nearest neighbours classifier supports multilabel classification

Evaluating a multilabel classifier
» Measure the F1 score for each individual label, then simply compute the average score
» This assumes all labels are equally important
» Alternatively, assign each label a weight equal to its support (i.e. the number of instances with that target label)

Multioutput-multiclass classification
» Generalisation of multilabel classification, where each label can be multiclass (i.e. it can have more than two possible values)
» Example: removing noise from images

In sum
» How to select good metrics for classification tasks
» Pick the appropriate precision/recall trade-off
» Compare classifiers
» Build good classification systems for a variety of tasks

https://github.com/ageron/handson-ml3

Next Monday: first exercise session -- 'End-to-end ML project'
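The two multilabel evaluation schemes described above (an unweighted average of per-label F1 scores versus a support-weighted average) can be sketched as follows; the per-label scores and supports are made-up numbers for illustration, not results from the course:

```python
# Averaging per-label F1 scores for a multilabel classifier.
# Hypothetical per-label results: (f1 score, support) for the two
# digit labels used on the slides, "large" and "odd".
per_label = {
    "large": (0.90, 2500),  # f1, number of instances with this label
    "odd":   (0.80, 5000),
}

# Macro average: every label counts equally.
macro_f1 = sum(f1 for f1, _ in per_label.values()) / len(per_label)

# Weighted average: each label weighted by its support.
total_support = sum(s for _, s in per_label.values())
weighted_f1 = sum(f1 * s for f1, s in per_label.values()) / total_support

print(round(macro_f1, 2))     # 0.85
print(round(weighted_f1, 3))  # 0.833
```

These correspond to the `average="macro"` and `average="weighted"` options of scikit-learn's `f1_score`; the weighted average pulls the score towards the labels with the most instances.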
