Machine Learning 1 Week 2 Lecture PDF
Summary
This document is a lecture on supervised machine learning. It covers the categories of analytics, the bias-variance tradeoff, statistical versus machine-learning thinking, and how to evaluate a classification model using confusion matrices, thresholds, and ROC curves.
Full Transcript
Machine Learning 1 Week 2 - Lecture: Supervised Machine Learning

Categories of Analytics

What do these terms mean?

| Category | Description | Question |
| --- | --- | --- |
| Descriptive | Describes past performance, history, and trends | What happened? |
| Diagnostic | Describes the reason for a trend | Why did it happen? |
| Predictive | Forecasts future trends | What will happen? |
| Prescriptive | Recommends a decision or course of action | How can we make it happen? |

Applied to customer churn:

| Question | Category |
| --- | --- |
| Which customers churned? | Descriptive |
| Why did the customers churn? | Diagnostic |
| Which customers will churn? | Predictive |
| What can I do to change the outcome of customer churn? | Prescriptive |

The Bias-Variance Tradeoff

- Minimizing the error requires minimizing both the bias and the variance.
- Variance is always non-negative, so we keep it as is; bias can be negative, so we square it. The expected error thus decomposes as Error = Bias² + Variance + Irreducible Error.
- The best we can do is live with the irreducible error.
- It is easy to optimize for one of the two; the idea is to optimize for both bias and variance.
- When we don't know f, we can't calculate the bias and variance directly. We know they are there, and we calculate the resulting error function as a proxy for how close we are to f.

Statistical vs. Machine Learning Thinking

Statistical thinking:
- Inference about a population from samples
- Quantifying uncertainty
- Hypothesis testing
- Validating assumptions

Machine learning thinking:
- An algorithmic approach to finding patterns in data
- Creating models that make accurate predictions on new, unseen data
- An experimental approach: try different learning algorithms and methods to find out which approach gets a good result

Simplicity (Occam's razor): use the simplest model that gets the job done. Sometimes we have no choice but to increase complexity to accomplish what we need.

Divergence in data pre-processing and model validation:
- ML: imputation, transformation
- Stats: sample size, hypothesis testing, validating assumptions

Outliers:
- ML: How do they impact my prediction?
- Stats: What do they mean?

Feature selection:
- ML: select features based on their impact on prediction (the multiplicity of good models)
- Stats: focus on interpreting relationships

Results:
- ML: prediction accuracy
- Stats: inference, confidence intervals

Learning to Drive

[Workflow figure: raw data → preprocessing (feature extraction and scaling, feature selection, dimensionality reduction, sampling) → labeled training dataset → learning algorithm (model selection, cross-validation, performance metrics, hyperparameter optimization) → evaluation against a labeled test dataset → final model → prediction on new data.]

Binary Classifiers

In classification, the goal is to predict a class label, which is a choice from a predefined list of possibilities. Binary classification (two classes) contrasts with multiclass classification. Examples of binary classification:
- Is this email spam?
- Will this patient have a heart attack?

Thresholds and Confusion Matrix

Evaluating a Classification Model

Evaluation should be based on test samples that were not used for training the model.
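A minimal sketch of this held-out evaluation with scikit-learn (the synthetic dataset, the logistic-regression model, and the 25% test split are illustrative assumptions, not part of the lecture):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Illustrative synthetic binary-classification data (an assumption for this sketch)
X, y = make_classification(n_samples=1000, n_features=10, random_state=42)

# Hold out a test set that the model never sees during training
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)

model = LogisticRegression()
model.fit(X_train, y_train)   # train on the training samples only

# Evaluate on the held-out test samples
print("Test accuracy:", model.score(X_test, y_test))
```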
- TP (true positive): a positive example predicted as positive
- FN (false negative): a positive example predicted as negative
- FP (false positive): a negative example predicted as positive
- TN (true negative): a negative example predicted as negative

Confusion matrix:

| Actual \ Predicted | Positive | Negative |
| --- | --- | --- |
| Positive | TP | FN |
| Negative | FP | TN |

The decision boundary is a threshold (t) that can be moved to bias a classifier towards a class (positive or negative).

If I asked you to predict which published papers will win a Nobel Prize and you predicted "no" for every paper, what do you think the impact would be on the accuracy of your prediction model? (Since almost no papers win, always predicting the majority class gives very high accuracy, which is why accuracy alone can be misleading on imbalanced data.)

Accuracy = Correct Predictions / Total Observations = (TP + TN) / (TP + TN + FP + FN)

Structure of Training and Prediction

1. Split the data: `X_train, X_test, y_train, y_test = splitData(X, y)`
2. Create the model: `model = modelName(parameters)`
3. Train the model: `model.fit(X_train, y_train)`, where `X_train` holds the inputs and `y_train` the outputs of the training examples.
4. Predict with the trained model on the test examples: `y_predicted = model.predict(X_test)`, where `X_test` holds the inputs of the examples to predict.
5. Check the predictive accuracy of the trained model on the test examples: `accuracy_score(y_test, y_predicted)`, where `y_test` holds the correct outputs for the predicted examples.

Threshold

The output of many models is a probability (between 0 and 1). For many classification problems, we use the default threshold (above 0.5 we predict 1; below it, we predict 0). Read https://developers.google.com/machine-learning/crash-course/classification/thresholding

What would be a good reason to set a high threshold? What would be a good reason to set a low threshold?

Exercise: take a binary classifier, experiment with changing threshold values, and observe the impact on the classification metrics (a sketch follows below).

ROC Curve for Classifier Evaluation
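One way to run the threshold experiment by hand, continuing from the sketch above (same `model`, `X_test`, and `y_test`; the threshold values 0.3, 0.5, and 0.9 are illustrative assumptions, not from the lecture):

```python
from sklearn.metrics import accuracy_score, confusion_matrix

# Predicted probability of the positive class for each test example
proba = model.predict_proba(X_test)[:, 1]

for t in (0.3, 0.5, 0.9):              # illustrative thresholds
    y_pred = (proba >= t).astype(int)  # predict 1 above the threshold, else 0
    print(f"threshold = {t}")
    print("  accuracy:", accuracy_score(y_test, y_pred))
    # Note: scikit-learn lays the matrix out as [[TN, FP], [FN, TP]],
    # i.e. the negative class comes first, unlike the table above.
    print("  confusion matrix:\n", confusion_matrix(y_test, y_pred))
```

Raising the threshold trades false positives for false negatives, and vice versa, which is why the "right" threshold depends on the relative cost of each error.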
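Sweeping the threshold across all values traces out the ROC curve; scikit-learn's `roc_curve` performs that sweep. A sketch, continuing from `proba` and `y_test` above (plotting with matplotlib is an assumption):

```python
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, roc_auc_score

# One (false positive rate, true positive rate) point per candidate threshold
fpr, tpr, thresholds = roc_curve(y_test, proba)
print("AUC:", roc_auc_score(y_test, proba))

plt.plot(fpr, tpr, label="classifier")
plt.plot([0, 1], [0, 1], linestyle="--", label="random guessing")
plt.xlabel("False positive rate")
plt.ylabel("True positive rate")
plt.title("ROC curve")
plt.legend()
plt.show()
```

Choosing a threshold in the previous sketch corresponds to picking a single point on this curve; the area under it (AUC) summarizes performance across all thresholds.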