Handout 7 Machine Learning PDF
Document Details
Faculty of Computer and Data Science, Alexandria University
Dr. Mohamed Mahfouz
Summary
This handout covers fundamental concepts in machine learning, including classification, clustering, and regression. It works through examples of linear and rule-based classifiers, KNN regression, regression performance metrics, cross-validation, ensemble voting, and k-means clustering.
Full Transcript
Data Science Tools and Software
Dr. Mohamed Mahfouz

Classification & Regression

Classification vs Clustering
- Classification (known categories): supervised classification, i.e. recognition.
- Clustering (unknown categories): unsupervised classification.
[Figure: labeled points in Category "A" and Category "B" for classification, beside an unlabeled point cloud for clustering.]

Pattern Recognition Tasks
1) Classification: given a collection of records (the training set), where each record contains a set of attributes and one of the attributes is the class, find a model for the class attribute as a function of the values of the other attributes.
2) Clustering: given a set of data points, each having a set of attributes, and a similarity measure among them, find clusters such that data points in one cluster are more similar to one another, and data points in separate clusters are less similar to one another.

Other related tasks:
- Association Rule Discovery: given a set of records, each of which contains some set of items from a given collection, produce dependency rules that predict occurrences of one item based on the occurrences of other items.

An early task:
- Regression analysis is a statistical process for estimating the relationships among variables. It includes many techniques for modeling and analyzing several variables.

Pattern Recognition Applications
[Figure: example pattern recognition applications.]

Linear Classifiers
A linear classifier maps an input x through a function f to a class estimate y_est:
$f(x, w, b) = \operatorname{sign}(w \cdot x + b)$
- $w \cdot x + b > 0$ denotes class +1
- $w \cdot x + b < 0$ denotes class -1
How would you classify this data? [Figure: a two-class scatter of points marked +1 and -1.]

Example rule-based models over attributes a, b, c, with a labeled data set:
- M1: if a > 50 and b > 70 then class = 1, otherwise class = -1
- M2: if a > 65 and c > 75 then class = 1, otherwise class = -1

a    b    c    class
65   57   54    1
45   65   48   -1
70   46   62   -1
48   91   87    1
61   33   38    1
66   59   76   -1
58   84   53    1

KNN Regression Example (k = 1)
Predict the House Price Index for the last record (age 48, loan $142,000) from its nearest neighbor; the Distance column gives each record's Euclidean distance to that query (a code sketch of this example appears after the cross-validation example below).

Age   Loan       House Price Index   Distance
25    $40,000    135                 102,000
35    $60,000    256                  82,000
45    $80,000    231                  62,000
20    $20,000    267                 122,000
35    $120,000   139                  22,000
52    $18,000    150                 124,000
23    $95,000    127                  47,000
40    $62,000    216                  80,000
60    $100,000   139                  42,000
48    $220,000   250                  78,000
33    $150,000   264                   8,000
48    $142,000   ?

$D = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$

For k > 1, take the average of the house prices of the k nearest neighbors.

KNN Regression – Standardized Distance
With age and loan min-max standardized, the distances to the query change, and so can the nearest neighbor:

Age     Loan   House Price Index   Distance
0.125   0.11   135                 0.7652
0.375   0.21   256                 0.5200
0.625   0.31   231                 0.3160
0       0.01   267                 0.9245
0.375   0.50   139                 0.3428
0.8     0.00   150                 0.6220
0.075   0.38   127                 0.6669
0.5     0.22   216                 0.4437
1       0.41   139                 0.3650
0.7     1.00   250                 0.3861
0.325   0.65   264                 0.3771
0.7     0.61   ?

$X_s = \dfrac{X - \min}{\max - \min}$

Performance Metrics of Regression
Root mean square error:
$\mathrm{RMSE} = \sqrt{\tfrac{1}{n}\sum_{i=1}^{n} (\hat{\theta}_i - \theta_i)^2}$
Relative absolute error:
$\mathrm{RAE} = \dfrac{\sum_{i=1}^{n} |\hat{\theta}_i - \theta_i|}{\sum_{i=1}^{n} |\theta_i - \bar{\theta}|}$
Root relative squared error:
$\mathrm{RRSE} = \sqrt{\dfrac{\sum_{i=1}^{n} (\hat{\theta}_i - \theta_i)^2}{\sum_{i=1}^{n} (\theta_i - \bar{\theta})^2}}$
where $\theta_i$ are the realized values, $\hat{\theta}_i$ the predicted values, and $\bar{\theta}$ is the average of the realized values (these metrics are sketched in code below).

K-folds cross-validation
The steps:
- Divide the training set into k partitions (folds).
- Choose one partition out of the k for testing.
- Train the classifier using the remaining k-1 partitions.
- Test the classifier on the test fold, counting an error for each misclassified sample.
- Repeat the above k times, keeping out a different fold each time for testing and using the remaining folds for training.
- Compute the error probability by averaging the counted errors.

Leave-one-out Method
The steps:
- Choose one sample out of the N.
- Train the classifier using the remaining N-1 samples.
- Test the classifier on the selected sample, counting an error if it is misclassified.
- Repeat the above, excluding a different sample each time.
- Compute the error probability by averaging the counted errors.

Cross-validation Example
Consider the following nearest-neighbor table. What is the accuracy of the k-nearest-neighbor classifier on these data points using the leave-one-out cross-validation method (k = 1)?

Object   Nearest neighbors   Label
x1       x3, x4, x6          -1
x2       x4, x6, x3           1
x3       x5, x2, x4           1
x4       x6, x3, x1          -1
x5       x4, x2, x1           1
x6       x4, x1, x2          -1
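With k = 1, each held-out object is predicted from the first neighbor listed in its row. A minimal sketch of this exercise, assuming the neighbor lists are ordered nearest-first (as the k = 1 question implies):

```python
# Leave-one-out 1-NN on the table above: each object is predicted
# by the label of its first-listed (nearest) neighbor.
labels  = {"x1": -1, "x2": 1, "x3": 1, "x4": -1, "x5": 1, "x6": -1}
nearest = {"x1": "x3", "x2": "x4", "x3": "x5",
           "x4": "x6", "x5": "x4", "x6": "x4"}

correct = sum(labels[nearest[obj]] == label for obj, label in labels.items())
print(f"leave-one-out accuracy (k=1): {correct}/{len(labels)}")
```

x1, x2, and x5 are misclassified by their nearest neighbors, so the leave-one-out accuracy is 3/6 = 50%.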
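Returning to the KNN regression example: a minimal sketch, assuming Euclidean distance over min-max-standardized age and loan. It reproduces the standardized distances tabulated above and predicts the nearest neighbor's House Price Index (231) for k = 1:

```python
import math

# (age, loan, house price index) records from the KNN example above
data = [
    (25, 40_000, 135), (35, 60_000, 256), (45, 80_000, 231),
    (20, 20_000, 267), (35, 120_000, 139), (52, 18_000, 150),
    (23, 95_000, 127), (40, 62_000, 216), (60, 100_000, 139),
    (48, 220_000, 250), (33, 150_000, 264),
]
query_age, query_loan = 48, 142_000

def minmax(values):
    # Xs = (X - min) / (max - min)
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# standardize age and loan over the records plus the query (last entry)
ages  = minmax([r[0] for r in data] + [query_age])
loans = minmax([r[1] for r in data] + [query_loan])

def knn_predict(k):
    dists = sorted(
        (math.hypot(ages[i] - ages[-1], loans[i] - loans[-1]), data[i][2])
        for i in range(len(data))
    )
    # for k > 1, average the house price index of the k nearest neighbors
    return sum(price for _, price in dists[:k]) / k

print(knn_predict(1))  # 231.0, the nearest standardized neighbor
print(knn_predict(3))  # average of the three nearest neighbors
```

Without standardization, the loan attribute dominates the raw Euclidean distance, which is why the slides recompute the example on standardized values.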
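The three regression metrics written out as plain functions, following the definitions given above; the actual/predicted values in the last two lines are made-up numbers for illustration:

```python
import math

def rmse(actual, predicted):
    # root mean square error
    return math.sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted))
                     / len(actual))

def rae(actual, predicted):
    # relative absolute error: total absolute error relative to always
    # predicting the mean of the realized values
    mean = sum(actual) / len(actual)
    return (sum(abs(p - a) for a, p in zip(actual, predicted))
            / sum(abs(a - mean) for a in actual))

def rrse(actual, predicted):
    # root relative squared error: like RAE, but with squared errors
    mean = sum(actual) / len(actual)
    return math.sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted))
                     / sum((a - mean) ** 2 for a in actual))

actual, predicted = [135, 256, 231, 267], [150, 240, 225, 250]
print(rmse(actual, predicted), rae(actual, predicted), rrse(actual, predicted))
```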
Rationale for Ensemble Learning
- There is no algorithm that is always the most accurate.
- Instead, generate a group of base learners which, when combined, have higher accuracy.
- Different learners use different:
  – algorithms
  – parameters
  – features
  – samples
  – subproblems

Voting
Linear combination of the base learners' outputs $d_j$ (a code sketch of this combiner appears at the end of the handout):
$y = \sum_{j=1}^{L} w_j d_j, \quad \text{with } w_j \ge 0 \text{ and } \sum_{j=1}^{L} w_j = 1$
For classification, the combination is taken per class:
$y_i = \sum_{j=1}^{L} w_j d_{ji}$

Clustering algorithms
- Hierarchical algorithms: create a hierarchical decomposition of the set of objects using some criterion.
- Partitional algorithms: construct various partitions and then evaluate them by some criterion (k-means, k-medoids).
Partitional clustering is nonhierarchical: each instance is placed in exactly one of K nonoverlapping clusters. Since only one set of clusters is output, the user normally has to input the desired number of clusters K.
[Figure: hierarchical versus partitional clustering.]

Algorithm k-means (sketched in code at the end of the handout):
1. Decide on a value for k.
2. Initialize the k cluster centers (randomly, if necessary).
3. Decide the class memberships of the N objects by assigning them to the nearest cluster center.
4. Re-estimate the k cluster centers, assuming the memberships found above are correct.
5. If none of the N objects changed membership in the last iteration, exit. Otherwise go to 3.

K-means Clustering: Steps 1-3
[Figures: scatter plots (algorithm: k-means; distance metric: Euclidean distance) showing the centers k1, k2, k3 being initialized and the objects assigned to their nearest center.]

Objective function
Let $x_1, x_2, \ldots, x_n \in \mathbb{R}^n$ be a finite number of patterns. To partition the patterns into $k$ partitions, $2 < k < n$, the following mathematical program is considered:
Minimize
$J(w, z) = \sum_{i=1}^{k} \sum_{j=1}^{n} w_{ij}\, d(x_j, z_i)$
where $w_{ij} \in \{0, 1\}$ and $d(x_j, z_i)$ is the distance between object $x_j$ and cluster center $z_i$.

K-means Clustering: Steps 4-5
[Figures: the re-estimated centers and the final assignment; the Step 5 plot's axes are labeled "expression in condition 1" and "expression in condition 2".]

Comments on the K-Means Method
Strength:
- Relatively efficient: O(tkn), where n is the number of objects, k is the number of clusters, and t is the number of iterations. Normally k, t ≪ n.
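A minimal sketch of the five-step k-means loop above, assuming 2-D points and Euclidean distance; the sample points at the bottom are made up for illustration:

```python
import math
import random

def kmeans(points, k, max_iter=100):
    # steps 1-2: given k, initialize the centers (randomly from the data)
    centers = random.sample(points, k)
    for _ in range(max_iter):
        # step 3: assign each object to its nearest cluster center
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        # step 4: re-estimate each center as the mean of its members
        new_centers = [
            tuple(sum(coord) / len(coord) for coord in zip(*cluster))
            if cluster else centers[i]
            for i, cluster in enumerate(clusters)
        ]
        # step 5: stop when no membership (hence no center) changed
        if new_centers == centers:
            break
        centers = new_centers
    return centers, clusters

points = [(1, 1), (1.5, 2), (3, 4), (5, 7), (3.5, 5), (4.5, 5), (3.5, 4.5)]
centers, clusters = kmeans(points, k=2)
print(centers)
```

Each iteration costs O(kn) distance computations, which is where the O(tkn) bound above comes from.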
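Finally, the weighted-voting combiner from the ensemble section, $y_i = \sum_{j=1}^{L} w_j d_{ji}$, as a small sketch; it assumes each base learner outputs one score per class, and the scores and weights below are made up:

```python
# combine L base learners' per-class scores with weights w_j >= 0, sum(w) = 1
def weighted_vote(scores, weights):
    # scores[j][i] is learner j's score d_ji for class i
    n_classes = len(scores[0])
    combined = [sum(w * row[i] for w, row in zip(weights, scores))
                for i in range(n_classes)]
    return max(range(n_classes), key=lambda i: combined[i]), combined

scores  = [[0.9, 0.1],   # learner 1
           [0.4, 0.6],   # learner 2
           [0.7, 0.3]]   # learner 3
weights = [0.5, 0.3, 0.2]
print(weighted_vote(scores, weights))  # class 0 wins with combined score ~0.71
```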