Summary

This document is a tutorial on Python Machine Learning, focusing on the Train/Test method and the confusion matrix. It provides code examples and explanations for evaluating machine learning models.

Full Transcript


Python ML Tutorial: TRAIN/TEST

Train / Test - Evaluate Your Model

In Machine Learning we create models to predict the outcome of certain events, like when we predicted the CO2 emission of a car from its weight and engine size. To measure whether the model is good enough, we can use a method called Train/Test.

What is Train/Test?

Train/Test is a method to measure the accuracy of your model. It is called Train/Test because you split the data set into two sets: a training set and a testing set. For example, you could use 80% for training and 20% for testing. You train the model using the training set, and you test the model using the testing set. Training the model means creating the model; testing the model means testing its accuracy.

Train / Test - Start with a Data Set

Our data set illustrates 100 customers in a shop and their shopping habits.

Example:

import numpy
import matplotlib.pyplot as plt

numpy.random.seed(2)

X = numpy.random.normal(3, 1, 100)
y = numpy.random.normal(150, 40, 100) / X

plt.scatter(X, y)
plt.show()

Result: The x axis represents the number of minutes before making a purchase. The y axis represents the amount of money spent on the purchase.

Train / Test - Split Into Train/Test

The training set should be a random selection of 80% of the original data. The testing set should be the remaining 20%.

train_X = X[:80]
train_y = y[:80]
test_X = X[80:]
test_y = y[80:]

Example - Display the same scatter plot with the training set:

plt.scatter(train_X, train_y)
plt.show()

Result: It looks like the original data set, so it seems to be a fair selection.

Train / Test - Display the Testing Set

To make sure the testing set is not completely different, we will take a look at the testing set as well.

Example:

plt.scatter(test_X, test_y)
plt.show()

Result: The testing set also looks like the original data set.
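A side note on the split itself: the slice-based 80/20 split above works here because X and y were generated in random order. As a minimal sketch (not part of the original tutorial, and assuming the same X and y arrays as above), scikit-learn's train_test_split can shuffle and split in one step:

# Minimal sketch: an alternative split using scikit-learn's train_test_split,
# which shuffles the data before splitting. Assumes X and y exist as above.
from sklearn.model_selection import train_test_split

train_X, test_X, train_y, test_y = train_test_split(
    X, y, test_size=0.2, random_state=2)  # 80% train, 20% test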
Train / Test - Fit the Data Set

What does the data set look like? In my opinion the best fit would be a polynomial regression, so let us draw a polynomial regression line. To draw a line through the data points, we use the plot() method of the matplotlib module.

Example - Draw a polynomial regression line through the data points:

import numpy
import matplotlib.pyplot as plt

numpy.random.seed(2)

X = numpy.random.normal(3, 1, 100)
y = numpy.random.normal(150, 40, 100) / X

train_X = X[:80]
train_y = y[:80]
test_X = X[80:]
test_y = y[80:]

mymodel = numpy.poly1d(numpy.polyfit(train_X, train_y, 4))

myline = numpy.linspace(0, 6, 100)

plt.scatter(train_X, train_y)
plt.plot(myline, mymodel(myline))
plt.show()

The result backs the suggestion that the data set fits a polynomial regression, even though it would give us some weird results if we try to predict values outside of the data set. Example: the line indicates that a customer spending 6 minutes in the shop would make a purchase worth 200. That is probably a sign of overfitting.

But what about the R-squared score? The R-squared score is a good indicator of how well my data set is fitting the model.

R2

Remember R2, also known as R-squared? It measures the relationship between the x axis and the y axis, and the value ranges from 0 to 1, where 0 means no relationship and 1 means totally related. The sklearn module has a method called r2_score() that will help us find this relationship. In this case we would like to measure the relationship between the minutes a customer stays in the shop and how much money they spend.

Example - How well does my training data fit in a polynomial regression?

import numpy
from sklearn.metrics import r2_score

numpy.random.seed(2)

X = numpy.random.normal(3, 1, 100)
y = numpy.random.normal(150, 40, 100) / X

train_X = X[:80]
train_y = y[:80]
test_X = X[80:]
test_y = y[80:]

mymodel = numpy.poly1d(numpy.polyfit(train_X, train_y, 4))

r2 = r2_score(train_y, mymodel(train_X))

print(r2)

Note: The result 0.799 shows that there is an OK relationship.

Train / Test - Bring in the Testing Set

Now we have made a model that is OK, at least when it comes to the training data. Now we want to test the model with the testing data as well, to see if it gives us the same result.

Example - Let us find the R2 score when using the testing data:

import numpy
from sklearn.metrics import r2_score

numpy.random.seed(2)

X = numpy.random.normal(3, 1, 100)
y = numpy.random.normal(150, 40, 100) / X

train_X = X[:80]
train_y = y[:80]
test_X = X[80:]
test_y = y[80:]

mymodel = numpy.poly1d(numpy.polyfit(train_X, train_y, 4))

r2 = r2_score(test_y, mymodel(test_X))

print(r2)

Note: The result 0.809 shows that the model fits the testing set as well, and we are confident that we can use the model to predict future values.

Train / Test - Predict Values

Now that we have established that our model is OK, we can start predicting new values.

Example - How much money will a buying customer spend if he or she stays in the shop for 5 minutes?

print(mymodel(5))

The example predicted the customer to spend 22.88 dollars, which seems to correspond to the diagram.

Python ML Tutorial: CONFUSION MATRIX

Confusion Matrix - What is a Confusion Matrix?

A confusion matrix is a table that is used in classification problems to assess where errors in the model were made. The rows represent the actual classes the outcomes should have been, while the columns represent the predictions we have made. Using this table it is easy to see which predictions are wrong.

Creating a Confusion Matrix

Confusion matrices can be created from predictions made by a logistic regression. For now we will generate actual and predicted values by utilizing NumPy.

Confusion Matrix

The target variable has two values: Positive or Negative. The rows represent the actual values of the target variable, and the columns represent the predicted values of the target variable.

True Positives (TP): when the actual value is Positive and the prediction is also Positive.
True Negatives (TN): when the actual value is Negative and the prediction is also Negative.
False Positives (FP): when the actual value is Negative but the prediction is Positive. Also known as a Type 1 error.
False Negatives (FN): when the actual value is Positive but the prediction is Negative. Also known as a Type 2 error.
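Laid out as a table (a reconstruction of the usual 2x2 layout implied by the definitions above, with rows as actual values and columns as predicted values):

                     Predicted Positive     Predicted Negative
Actual Positive      True Positive (TP)     False Negative (FN)
Actual Negative      False Positive (FP)    True Negative (TN)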
Confusion Matrix - Example

Let's take an example: we have a total of 20 cats and dogs, and our model predicts whether each animal is a cat or not.

Actual values: 'dog', 'cat', 'dog', 'cat', 'dog', 'dog', 'cat', 'dog', 'cat', 'dog', 'dog', 'dog', 'dog', 'cat', 'dog', 'dog', 'cat', 'dog', 'dog', 'cat'

Predicted values: 'dog', 'dog', 'dog', 'cat', 'dog', 'dog', 'cat', 'cat', 'cat', 'cat', 'dog', 'dog', 'dog', 'cat', 'dog', 'dog', 'cat', 'dog', 'dog', 'cat'

# Import the necessary libraries.
import numpy as np
from sklearn.metrics import confusion_matrix
import seaborn as sns
import matplotlib.pyplot as plt

# Create the NumPy arrays for the actual and predicted labels.
actual = np.array(
    ['dog', 'cat', 'dog', 'cat', 'dog', 'dog', 'cat', 'dog', 'cat', 'dog',
     'dog', 'dog', 'dog', 'cat', 'dog', 'dog', 'cat', 'dog', 'dog', 'cat'])
predicted = np.array(
    ['dog', 'dog', 'dog', 'cat', 'dog', 'dog', 'cat', 'cat', 'cat', 'cat',
     'dog', 'dog', 'dog', 'cat', 'dog', 'dog', 'cat', 'dog', 'dog', 'cat'])

# Compute the confusion matrix.
cm = confusion_matrix(actual, predicted)

# Plot the confusion matrix.
sns.heatmap(cm, annot=True, fmt='g',
            xticklabels=['Positive (CAT)', 'Negative (DOG)'],
            yticklabels=['Positive (CAT)', 'Negative (DOG)'])
plt.xlabel('Predicted Values', fontsize=13)
plt.ylabel('Actual Values', fontsize=13)
plt.title('Confusion Matrix', fontsize=17)
plt.show()

True Positive (TP) = 6: You predicted positive and it's true. You predicted that an animal is a cat and it actually is.
True Negative (TN) = 11: You predicted negative and it's true. You predicted that an animal is not a cat and it actually is not (it's a dog).
False Positive (Type 1 Error) (FP) = 2: You predicted positive and it's false. You predicted that an animal is a cat but it actually is not (it's a dog).
False Negative (Type 2 Error) (FN) = 1: You predicted negative and it's false. You predicted that an animal is not a cat but it actually is.

Confusion Matrix - Classification Measures

Classification measures are an extended version of the confusion matrix. There are measures other than the confusion matrix which can help achieve a better understanding and analysis of our model and its performance:

- Accuracy
- Precision
- Recall (TPR, Sensitivity)
- F1-Score
- FPR (Type I Error)
- FNR (Type II Error)

Confusion Matrix - Accuracy

Accuracy simply measures how often the classifier makes a correct prediction. It is the ratio between the number of correct predictions and the total number of predictions. The accuracy metric is not suited for imbalanced classes: with imbalanced data, a model that predicts the majority class for every point will have a high accuracy even though it is not a good model. Accuracy is a valid choice of evaluation metric for classification problems that are well balanced and not skewed, i.e. where there is no class imbalance.
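As a quick sanity check (not in the original slides), plugging the cat/dog counts above (TP = 6, TN = 11, FP = 2, FN = 1) into the accuracy ratio gives:

# Quick check: accuracy for the cat/dog example, computed from the counts above.
tp, tn, fp, fn = 6, 11, 2, 1
accuracy = (tp + tn) / (tp + tn + fp + fn)
print(accuracy)  # 17 / 20 = 0.85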
Confusion Matrix - Precision

Precision is a measure of correctness that is achieved in true prediction. In simple words, it tells us how many predictions are actually positive out of all the positive predictions made. Precision is defined as the ratio of the total number of correctly classified positive cases divided by the total number of predicted positive cases. In other words: out of all the predicted positive cases, how many did we predict correctly? Precision should be high (ideally 1).

Precision is a useful metric in cases where a False Positive is a higher concern than a False Negative.

Example 1: Spam detection. Suppose a mail is not spam but the model predicts it as spam: that is a False Positive (FP). We always try to reduce FP, so we need to focus on precision.
Example 2: Precision is important in music or video recommendation systems, e-commerce websites, etc. Wrong results could lead to customer churn and be harmful to the business.

Confusion Matrix - Recall

Recall is a measure of the actual observations which are predicted correctly, i.e. how many observations of the positive class are actually predicted as positive. It is also known as Sensitivity. Recall is a valid choice of evaluation metric when we want to capture as many positives as possible. Recall is defined as the ratio of the total number of correctly classified positive cases divided by the total number of actual positive cases. In other words: out of all the actual positive cases, how many did we predict correctly? Recall should be high (ideally 1).

Recall is a useful metric in cases where a False Negative outweighs a False Positive.

Example 1: Does a person have cancer or not? The person is suffering from cancer, but the model predicts that they are not: that is a False Negative.
Example 2: Recall is important in medical cases where it does not matter whether we raise a false alarm, but the actual positive cases should not go undetected. Recall would be the better metric because we do not want to accidentally discharge an infected person and let them mix with the healthy population, thereby spreading a contagious virus. Now you can understand why accuracy was a bad metric for our model.

Confusion Matrix - F1-Score

The F1 score is a number between 0 and 1 and is the harmonic mean of precision and recall. We use the harmonic mean because it is not sensitive to extremely large values, unlike simple averages. The F1 score tries to maintain a balance between precision and recall for your classifier. If your precision is low, the F1 score is low, and if the recall is low, your F1 score is again low. There will be cases where there is no clear distinction between whether Precision is more important or Recall, so we combine them.

In practice, when we try to increase the precision of our model, the recall goes down, and vice versa. The F1-score captures both of these trends in a single value. The F1 score is the harmonic mean of Precision and Recall; compared to the arithmetic mean, the harmonic mean punishes extreme values more. The F-score should be high (ideally 1).
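To see how the harmonic mean punishes extreme values more than the arithmetic mean, here is a tiny illustrative calculation (not from the original slides; the precision/recall numbers are made up purely for illustration):

# Illustration: harmonic mean (F1) versus arithmetic mean for a lopsided
# precision/recall pair.
precision, recall = 1.0, 0.1

arithmetic_mean = (precision + recall) / 2              # 0.55
f1 = 2 * (precision * recall) / (precision + recall)    # ~0.18

print(arithmetic_mean, f1)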
Confusion Matrix - Sensitivity versus Specificity

Sensitivity (Recall): Of all positive cases, what percentage are predicted positive? Sensitivity (sometimes called Recall) measures how good the model is at predicting positives. This means it looks at true positives and false negatives (which are positives that have been incorrectly predicted as negative). Sensitivity is good for understanding how well the model predicts something positive.
How to Calculate: True Positive / (True Positive + False Negative)

Specificity: How well is the model at predicting negative results? Specificity is similar to sensitivity, but looks at it from the perspective of negative results. Since it is just the opposite of Recall, we use the recall_score function with the opposite positive label (pos_label).
How to Calculate: True Negative / (True Negative + False Positive)
Example: Specificity = metrics.recall_score(actual, predicted, pos_label=0)

Confusion Matrix

Calculating the performance measures for the cat/dog example:

# Calculate performance measures.
from sklearn import metrics

print('Confusion Matrix \n', cm, '\n')
print("Accuracy:", metrics.accuracy_score(actual, predicted))
print("Precision:", metrics.precision_score(actual, predicted, pos_label='cat'))
print("Recall:", metrics.recall_score(actual, predicted, pos_label='cat'))
print("Specificity:", metrics.recall_score(actual, predicted, pos_label='dog'))
print("F1-score:", metrics.f1_score(actual, predicted, pos_label='cat'))

# The classification report computes all of the above.
print('\nClassification Report')
print(metrics.classification_report(actual, predicted))

Confusion Matrix

Is it necessary to check for recall or precision if you already have a high accuracy? We cannot rely on a single accuracy value for classification when the classes are imbalanced. For example, suppose we have a dataset of 100 patients in which 5 have diabetes and 95 are healthy. If the model simply predicts the majority class, i.e. that all 100 people are healthy, it still achieves a classification accuracy of 95% while never detecting a single diabetic patient.

When to use Accuracy / Precision / Recall / F1-Score?

- Accuracy is used when the True Positives and True Negatives are more important. Accuracy is a better metric for balanced data.
- Whenever a False Positive is much more important, use Precision.
- Whenever a False Negative is much more important, use Recall.
- F1-Score is used when the False Negatives and False Positives are both important. F1-Score is also a better metric for imbalanced data.

Confusion Matrix

We will need to generate the numbers for the "actual" and "predicted" values:

actual = numpy.random.binomial(1, 0.9, size=1000)
predicted = numpy.random.binomial(1, 0.9, size=1000)

In order to create the confusion matrix we need to import metrics from the sklearn module. Once metrics is imported, we can use the confusion_matrix function on our actual and predicted values:

from sklearn import metrics

confusion_matrix = metrics.confusion_matrix(actual, predicted)

To create a more interpretable visual display we need to convert the table into a confusion matrix display:

cm_display = metrics.ConfusionMatrixDisplay(confusion_matrix = confusion_matrix, display_labels = [False, True])

Visualizing the display requires that we import pyplot from matplotlib. Finally, to display the plot we can use the functions plot() and show() from pyplot:

import matplotlib.pyplot as plt

cm_display.plot()
plt.show()

Confusion Matrix - Example 2:

import matplotlib.pyplot as plt
import numpy as np
from sklearn import metrics

actual = np.random.binomial(1, .9, size = 1000)
predicted = np.random.binomial(1, .9, size = 1000)

confusion_matrix = metrics.confusion_matrix(actual, predicted)

cm_display = metrics.ConfusionMatrixDisplay(confusion_matrix = confusion_matrix, display_labels = [False, True])

cm_display.plot()
plt.show()

Confusion Matrix - Results Explained

The Confusion Matrix created has four different quadrants: True Negative (top-left), False Positive (top-right), False Negative (bottom-left) and True Positive (bottom-right). True means that the values were accurately predicted, False means that there was an error or wrong prediction.
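As a quick way to read those quadrants programmatically, the four counts can be unpacked directly from the matrix. A minimal sketch (not part of the original tutorial), assuming the 0/1 actual and predicted arrays from Example 2:

# Minimal sketch: unpack the four quadrants of a binary (0/1) confusion matrix.
# Assumes `actual` and `predicted` are the arrays generated in Example 2.
from sklearn import metrics

tn, fp, fn, tp = metrics.confusion_matrix(actual, predicted).ravel()
print("TN:", tn, "FP:", fp, "FN:", fn, "TP:", tp)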
Now that we have made a Confusion Matrix, we can calculate different measures to quantify the quality of the model. First, let's look at Accuracy.

Confusion Matrix - Created Metrics

The matrix provides us with many useful metrics that help us to evaluate our classification model. The different measures include: Accuracy, Precision, Sensitivity (Recall), Specificity, and F-score.

Accuracy: Accuracy measures how often the model is correct.
How to Calculate: (True Positive + True Negative) / Total Predictions
Example: Accuracy = metrics.accuracy_score(actual, predicted)

Precision: Of the positives predicted, what percentage is truly positive? Precision does not evaluate the correctly predicted negative cases.
How to Calculate: True Positive / (True Positive + False Positive)
Example: Precision = metrics.precision_score(actual, predicted)

Sensitivity (Recall): Of all positive cases, what percentage are predicted positive? Sensitivity (sometimes called Recall) measures how good the model is at predicting positives. This means it looks at true positives and false negatives (which are positives that have been incorrectly predicted as negative). Sensitivity is good for understanding how well the model predicts something positive.
How to Calculate: True Positive / (True Positive + False Negative)
Example: Sensitivity_recall = metrics.recall_score(actual, predicted)

Specificity: How well is the model at predicting negative results? Specificity is similar to sensitivity, but looks at it from the perspective of negative results. Since it is just the opposite of Recall, we use the recall_score function with the opposite positive label (pos_label).
How to Calculate: True Negative / (True Negative + False Positive)
Example: Specificity = metrics.recall_score(actual, predicted, pos_label=0)

F-score: F-score is the "harmonic mean" of precision and sensitivity. It considers both false positive and false negative cases and is good for imbalanced datasets. This score does not take the True Negative values into consideration.
How to Calculate: 2 * ((Precision * Sensitivity) / (Precision + Sensitivity))
Example: F1_score = metrics.f1_score(actual, predicted)

Confusion Matrix - All calculations in one:

# Metrics
Accuracy = metrics.accuracy_score(actual, predicted)
Precision = metrics.precision_score(actual, predicted)
Sensitivity_recall = metrics.recall_score(actual, predicted)
Specificity = metrics.recall_score(actual, predicted, pos_label=0)
F1_score = metrics.f1_score(actual, predicted)

print({"Accuracy": Accuracy,
       "Precision": Precision,
       "Sensitivity_recall": Sensitivity_recall,
       "Specificity": Specificity,
       "F1_score": F1_score})

Output:

{'Accuracy': 0.842, 'Precision': 0.9120879120879121, 'Sensitivity_recall': 0.9140969162995595, 'Specificity': 0.13043478260869565, 'F1_score': 0.9130913091309131}

References

https://www.w3schools.com/python/python_ml_getting_started.asp
https://scikit-learn.org/
https://scikit-learn.org/stable/user_guide.html
https://medium.com/analytics-vidhya/what-is-a-confusion-matrix-d1c0f8feda5
