BA 3551 Review PDF
Document Details
Carlson School of Management
Jim Nord
Summary
This document provides a review of data mining methods. It covers topics such as association rules, cluster analysis, classification, and numeric prediction. The review is intended for students in a business analytics course.
Full Transcript
BA 3551 Review
Jim Nord
Information and Decision Sciences
Carlson School of Management
Email: [email protected]

Data Mining Methods

UNSUPERVISED/DESCRIPTIVE LEARNING METHODS
- Descriptive and exploratory in nature.
- Data is mined to uncover previously unknown, interesting patterns without having a clear outcome in mind.
- Make use of unlabeled data: there is no specified outcome variable.
- Assessing how “good” the results are can be a little subjective.

ASSOCIATION RULES
- Investigating the co-occurrence of items, events, etc.
- (Apriori Algorithm)

CLUSTER ANALYSIS
- Understanding if data can be naturally grouped into clusters.
- (Hierarchical Clustering, k-Means)

SUPERVISED/PREDICTIVE LEARNING METHODS
- Prediction methods.
- Data is mined with a clear objective in mind: predicting a specific outcome.
- Make use of labeled data: we need to have data where the outcome variable is known.
- Assessing the performance of the model is done relying on objective metrics.

CLASSIFICATION
- Predicting categorical values of an outcome variable.
- K-NN
- Decision Trees

NUMERIC PREDICTION
- Predicting continuous values of an outcome variable.
- K-NN
- Regression Trees

Association Rule Mining

GENERAL IDEA: Investigating the co-occurrence of items, events, variables and so on. Find out which item-sets co-occur.

Association Rule: X → Y, e.g. {Milk, Diapers} → {Coke}

Support metrics
- Support Count: Count{X}, Count{X and Y}
- Support Frequency (or Percentage): Count{X}/Tot. Transactions, Count{X and Y}/Tot. Transactions

Confidence, range [0, 1]: given X, what is the prob. of having Y?

$$\mathrm{Conf}(X \rightarrow Y) = \frac{\mathrm{Count}(X, Y)}{\mathrm{Count}(X)}$$

Lift, range [0, Infinite): how much more likely do item-sets co-occur than by pure chance?

$$\mathrm{Lift}(X \rightarrow Y) = \frac{\mathrm{SuppPerc}(X \rightarrow Y)}{\mathrm{SuppPerc}(X) \cdot \mathrm{SuppPerc}(Y)} = \frac{\mathrm{Conf}(X \rightarrow Y)}{\mathrm{SuppPerc}(Y)}$$

What does it mean to have Lift >,
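The support, confidence, and lift definitions from the association rule slides can be sketched in a few lines of Python. This is a minimal illustration, not the course's own code: the `TRANSACTIONS` list is invented toy data, and the function names (`support_count`, `support_perc`, `confidence`, `lift`) are hypothetical helpers chosen to mirror the slide notation.

```python
from typing import Iterable, List, Set

# Hypothetical toy basket data, invented purely for illustration.
TRANSACTIONS: List[Set[str]] = [
    {"Milk", "Diapers", "Coke"},
    {"Milk", "Diapers"},
    {"Milk", "Coke"},
    {"Diapers", "Coke"},
    {"Bread"},
]


def support_count(itemset: Iterable[str], transactions: List[Set[str]]) -> int:
    """Count{X}: number of transactions that contain every item in X."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= t)


def support_perc(itemset: Iterable[str], transactions: List[Set[str]]) -> float:
    """Support frequency: Count{X} / total number of transactions."""
    return support_count(itemset, transactions) / len(transactions)


def confidence(x: Iterable[str], y: Iterable[str],
               transactions: List[Set[str]]) -> float:
    """Conf(X -> Y) = Count(X, Y) / Count(X).

    Raises ZeroDivisionError if X never occurs in the transactions.
    """
    both = set(x) | set(y)
    return support_count(both, transactions) / support_count(x, transactions)


def lift(x: Iterable[str], y: Iterable[str],
         transactions: List[Set[str]]) -> float:
    """Lift(X -> Y) = Conf(X -> Y) / SuppPerc(Y)."""
    return confidence(x, y, transactions) / support_perc(y, transactions)


if __name__ == "__main__":
    X, Y = {"Milk", "Diapers"}, {"Coke"}
    print("Support count of X:", support_count(X, TRANSACTIONS))
    print("Confidence X -> Y:", confidence(X, Y, TRANSACTIONS))
    print("Lift X -> Y:", lift(X, Y, TRANSACTIONS))
```

Both forms of the lift formula give the same value here, since dividing the joint support percentage by SuppPerc(X) yields the confidence. A real Apriori run would also prune item-sets below a minimum support threshold, which this sketch does not attempt.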