Python for Rapid Engineering Solutions PDF
ASU
Steve Millman
Summary
This document provides an introduction to regression and clustering analysis in Python. It covers linear and polynomial regression, comparing model quality measures like MSE and R^2, and demonstrates how to use Python libraries like pandas, matplotlib, and scikit-learn. It also encompasses k-means clustering and DBSCAN, showing how unsupervised methods can be applied to data.
Full Transcript
Python for Rapid Engineering Solutions
Steve Millman
Regression Analysis
Welcome!

Today's Objectives:
Understand regression analysis
Apply regression analysis to housing data
Example code from Raschka

Regression Analysis
Rather than classes, map to a continuous variable
Example: given features of a house, calculate price
Zillow held a contest for this! Prize: $1,000,000
Many ways to do this:
    linear regression
    polynomial regression
    decision trees or random forest
    and others…
Need a policy to handle outliers

Measuring Model Quality: MSE
Mean Square Error:
$MSE = \frac{1}{n} \sum_{i=1}^{n} \left( y^{(i)} - \hat{y}^{(i)} \right)^2$
Average square of the distance from the actual value

Measuring Model Quality: R^2
Coefficient of Determination:
$R^2 = 1 - \frac{SSE}{SST}$
$SSE = \sum_{i=1}^{n} \left( y^{(i)} - \hat{y}^{(i)} \right)^2$ and $SST = \sum_{i=1}^{n} \left( y^{(i)} - \mu_y \right)^2$
Simplifying, we get
$R^2 = 1 - \frac{MSE}{Var(y)}$

Housing Data
CRIM      Per capita crime rate
ZN        % of residential land zoned for lots over 25k sq ft
INDUS     % of non-retail acres
CHAS      1 if on a river; 0 otherwise
NOX       Nitric oxide concentration
RM        Average number of rooms
AGE       % of owner-occupied units built before 1940
DIS       Weighted distance to 5 business centers
RAD       Index of accessibility to radial highways
TAX       Full-value property tax rate
PTRATIO   Pupil-teacher ratio
B         Measure of population of African descent
LSTAT     % of lower status of population
MEDV      Median value of owner-occupied homes in $1000s

Data Analysis: Set Up
m7_housing1.py

import matplotlib.pyplot as plt                   # for plotting
import pandas as pd                               # for data frame
import seaborn as sns                             # data analysis
import numpy as np                                # to compute correlation
from mlxtend.plotting import scatterplotmatrix    # a pair plot function

# the file is whitespace-delimited; a raw string avoids an invalid-escape warning
df = pd.read_csv('m7_housing.data', header=None, sep=r'\s+')
df.columns = ['CRIM','ZN','INDUS','CHAS','NOX','RM','AGE','DIS',
              'RAD','TAX','PTRATIO','B','LSTAT','MEDV']
print(df.head())

> python m7_housing1.py
      CRIM    ZN  INDUS  CHAS  ...  PTRATIO       B  LSTAT  MEDV
0  0.00632  18.0   2.31     0  ...     15.3  396.90   4.98  24.0
1  0.02731   0.0   7.07     0  ...     17.8  396.90   9.14  21.6
2  0.02729   0.0   7.07     0  ...     17.8  392.83   4.03  34.7
3  0.03237   0.0   2.18     0  ...     18.7  394.63   2.94  33.4
4  0.06905   0.0   2.18     0  ...     18.7  396.90   5.33  36.2
[5 rows x 14 columns]

Data Analysis: Create Charts
m7_housing1.py (continued)

# use mlxtend to create a pair plot for 5 of the columns
cols = ['LSTAT','INDUS','NOX','RM','MEDV']
scatterplotmatrix(df[cols].values,figsize=(10,8),names=cols,alpha=.5)
plt.tight_layout()                                # spread out the charts a bit
plt.show()

# now compute the correlation coefficients and use seaborn to plot a heat map
# annot indicates whether the value should be shown in each square
# annot_kws is a dictionary for how to present the text in the squares
# fmt is for formatting the text in the squares
cm = np.corrcoef(df[cols].values.T)
hm = sns.heatmap(cm,cbar=True,annot=True,square=True,fmt='.2f',
                 annot_kws={'size':15},
                 yticklabels=cols,xticklabels=cols)
plt.show()

Data Analysis: Pair Plot [figure]
Data Analysis: Correlation Heat Map [figure]

Regression Analysis: Set Up
m7_housing2.py

import matplotlib.pyplot as plt                          # for plotting
import pandas as pd                                      # for data frame
from sklearn.model_selection import train_test_split    # split the data
from sklearn.linear_model import LinearRegression       # algorithm to use
from sklearn.metrics import mean_squared_error          # data analysis
from sklearn.metrics import r2_score                    # data analysis

# read the data, assign column names
df = pd.read_csv('m7_housing.data', header=None, sep=r'\s+')
df.columns = ['CRIM','ZN','INDUS','CHAS','NOX','RM','AGE','DIS',
              'RAD','TAX','PTRATIO','B','LSTAT','MEDV']

X = df.iloc[:,:-1].values    # features are all rows and all but last column
y = df['MEDV'].values        # value to predict is the last column

# now do the train/test split
X_train, X_test, y_train, y_test = train_test_split(X,y,test_size=0.3,random_state=0)

# NOTE: LinearRegression works WITHOUT requiring standardization!
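The identity from the Measuring Model Quality slides, R^2 = 1 - SSE/SST = 1 - MSE/Var(y), can be checked numerically. The sketch below is only an illustration in plain Python, not part of m7_housing2.py: the helper names (`mse`, `variance`, `r_squared`) are hypothetical, the actual values are the five MEDV entries printed by df.head(), and the predicted values are made up.

```python
# Illustrative check that 1 - SSE/SST equals 1 - MSE/Var(y).
# Helper names and the predicted values are hypothetical, not from the course code.

def mse(y_true, y_pred):
    """Mean square error: average squared distance from the actual value."""
    n = len(y_true)
    return sum((y - yh) ** 2 for y, yh in zip(y_true, y_pred)) / n

def variance(y_true):
    """Population variance of the actual values (divides by n, like SST / n)."""
    n = len(y_true)
    mu = sum(y_true) / n
    return sum((y - mu) ** 2 for y in y_true) / n

def r_squared(y_true, y_pred):
    """Coefficient of determination computed directly as 1 - SSE/SST."""
    mu = sum(y_true) / len(y_true)
    sse = sum((y - yh) ** 2 for y, yh in zip(y_true, y_pred))
    sst = sum((y - mu) ** 2 for y in y_true)
    return 1.0 - sse / sst

y_actual = [24.0, 21.6, 34.7, 33.4, 36.2]      # MEDV values from df.head()
y_predicted = [25.0, 23.0, 31.0, 30.0, 35.0]   # made-up predictions

r2_direct = r_squared(y_actual, y_predicted)
r2_from_mse = 1.0 - mse(y_actual, y_predicted) / variance(y_actual)

print('R^2 from SSE/SST:   %.6f' % r2_direct)
print('R^2 from MSE/Var(y): %.6f' % r2_from_mse)
```

The two printed values agree because dividing both SSE and SST by n turns them into MSE and Var(y). This relies on the population variance (dividing by n); a sample variance that divides by n-1 would give a slightly different number.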
Regression Analysis: Train and Test
m7_housing2.py (continued):

slr = LinearRegression()                 # instantiate a linear regression tool
slr.fit(X_train,y_train)                 # fit the data
y_train_pred = slr.predict(X_train)      # predict the training values
y_test_pred = slr.predict(X_test)        # predict the test values

# plot the train and test residuals vs their predictions.
# The residual is the difference between the predicted and actual values
ax = plt.axes()
ax.set_facecolor('grey')
plt.scatter(y_train_pred, y_train_pred - y_train, c='blue', marker='x', label='Training data')
plt.scatter(y_test_pred, y_test_pred - y_test, c='orange', marker='+', label='Test data')
plt.xlabel('Predicted values')
plt.ylabel('Residuals')
plt.legend(loc='upper left')
plt.hlines(y=0,xmin=-10,xmax=50,lw=2,color='red')
plt.xlim([-10,50])
plt.show()

> python m7_housing2.py

Regression Analysis: Results (Residuals) [figure]

Regression Analysis: Quality Check
m7_housing2.py (continued):

# Calculate measures of the performance of the model.
# First, calculate the mean square error of both the train and test set.
# The results suggest some overfitting since test MSE is higher than train MSE
print('MSE train: %.3f, test: %.3f' % (
      mean_squared_error(y_train,y_train_pred),
      mean_squared_error(y_test,y_test_pred)))

# Now calculate the coefficient of determination, R^2
print('R^2 train: %.3f, test: %.3f' %
      (r2_score(y_train,y_train_pred),r2_score(y_test,y_test_pred)))

MSE train: 19.958, test: 27.196
R^2 train: 0.765, test: 0.673

Clustering Analysis
Welcome!

Today's Objectives:
Be able to define unsupervised learning
Use k-means to classify samples
Use DBSCAN to classify samples
Example code from Raschka

Unsupervised Learning
Lots of samples available but no ground truth
Various methods: we'll look at two
Clustering finds structure in data
Common problem
    Does the data form groups?
    What does group membership mean?
Can be used to find ways samples form groups
Can be used for recommendation engines

Clustering
Goal is to organize samples into clusters
Not always clear which algorithm will work best
Not always clear there are clusters!
Curse of dimensionality:
    Too many features cause problems!
    Use dimension reduction techniques, e.g. PCA

k-means Methodology
Assumes the samples form clusters
Guess there are N clusters
1. Randomly pick N center points
2. Assign each sample to the "closest" center point
3. Move the center point so it is at the cluster's center
4. Repeat steps 2 and 3 until at limit or nothing moves
k-means++ spreads out the initial center points
Assumes spherical shaped clusters

k-means: How Many Groups?
Unfortunately, you have to decide
Can get empty clusters!
Plot the within-cluster Sum of Square Error (SSE)
SSE is a measure of the distance of samples from centers
The more clusters, the lower the SSE
Heuristic: Plot the SSE and look for the "elbow"

k-means: Sum of Square Errors
$SSE = \sum_{i=1}^{n} \sum_{j=1}^{k} w^{(i,j)} \left\| x^{(i)} - \mu^{(j)} \right\|_2^2$
where:
    $n$ is the number of samples
    $k$ is the number of clusters
    $w^{(i,j)}$ is 1 if sample $x^{(i)}$ is in cluster $j$, and 0 otherwise
    $\mu^{(j)}$ is the centroid of cluster $j$

k-means: Controls
Use these parameters to control k-means:
    n_clusters: the number of clusters to use
    init: specify k-means++ for better initial centroids
    n_init: # of times to run with different initial centroids
    max_iter: max # of times to loop through the algorithm
    tol: measure of how far centers move to declare done
    random_state: random number seed

k-means: Set Up
m7_kmeans.py:

from sklearn.datasets import make_blobs    # create clusters of data
from sklearn.cluster import KMeans         # cluster analysis algorithm
import matplotlib.pyplot as plt            # so we can plot the data

# create 3 blobs (centers) using 2 features (n_features) and 150 samples.
# cluster_std is the standard deviation of each blob
# shuffle the samples in random order (rather than 0s, 1s, then 2s)
X,y = make_blobs(n_samples=150,n_features=2,centers=3,
                 cluster_std=0.5,shuffle=True,random_state=0)
#print(X,"\n",y)                           # debug print to show samples

plt.scatter(X[:,0],X[:,1],c='red',marker='o',s=50)    # plot the data
plt.grid()
plt.title('kmeans cluster data')
plt.show()

mkrs = ['s','o','v','^','x']    # markers and colors to use for each cluster
clrs = ['orange','green','blue','purple','gold']

> python m7_kmeans.py

k-means: The Data [figure]

k-means: Train!
m7_kmeans.py (continued):

inertia = []    # track the Sum of Squares Error (SSE)
for numcs in range(1,6):
    km = KMeans(n_clusters=numcs,init='k-means++',
                n_init=10,max_iter=300,tol=1e-4,random_state=0)
    y_km = km.fit_predict(X)
    inertia.append(km.inertia_)    # built-in measure of SSE

    for clustnum in range(numcs):
        # X[y_km==clustnum,0] says use the entry in X if the corresponding value
        # in y_km is equal to clustnum. Same for the x and y coordinates
        plt.scatter(X[y_km==clustnum,0],X[y_km==clustnum,1],    # select samples
                    c=clrs[clustnum],                           # pick color
                    s=50,                                       # marker size
                    marker=mkrs[clustnum],                      # which marker
                    label='cluster'+str(clustnum+1))            # which cluster

    # plot the centers
    plt.scatter(km.cluster_centers_[:,0],km.cluster_centers_[:,1],
                s=250,c='red',marker='*',label='centroids')
    plt.legend()
    plt.grid()
    plt.title('kmeans with ' + str(numcs) + ' clusters')
    plt.show()

k-means: One, Two, Three, Four, and Five Clusters [figures, one per cluster count]

k-means: Plot the SSE
m7_kmeans.py (continued):

plt.plot(list(range(1,len(inertia)+1)),inertia,marker='x')
plt.xlabel('number of clusters')
plt.ylabel('inertia')
plt.title('kmeans cluster analysis')
plt.show()

DBSCAN: Density Based Analysis
Assumes nearby samples belong together
So no assumption about shape or number of clusters
Can control how "close" they need to be
Places samples into 3 categories:
1. core points: those with at least min_samples samples within 𝜀
2. boundary points: within 𝜀 of a core point
3. noise points: all others
DBSCAN can remove noise points

DBSCAN: Set Up
m7_dbscan.py:

import matplotlib.pyplot as plt            # needed for plotting
from sklearn.cluster import KMeans         # center-based analysis
from sklearn.datasets import make_moons    # create moon-shaped data sets
from sklearn.cluster import DBSCAN         # density-based analysis

# make_moons ALWAYS makes 2 interleaving half circles!
# will make 200 samples; noise is the standard deviation of Gaussian noise
X,y = make_moons(n_samples=200,noise=0.05,random_state=0)
plt.scatter(X[:,0],X[:,1],c='blue')
plt.title('DBSCAN data')
plt.show()                                 # and show the moons

> python m7_dbscan.py

DBSCAN: The Data [figure]

DBSCAN: Try with k-means
m7_dbscan.py (continued):

# use the KMeans algorithm and plot it - it fails to do a good job!
# (See comments in pml312 for an explanation of the parameters...)
km = KMeans(n_clusters=2,init='k-means++',random_state=0)
y_km = km.fit_predict(X)
plt.scatter(X[y_km==0,0],X[y_km==0,1],s=40, c='blue',marker='o',label='cluster 1')
plt.scatter(X[y_km==1,0],X[y_km==1,1],s=40,c='red',marker='s',label='cluster 2')
plt.scatter(km.cluster_centers_[:,0],km.cluster_centers_[:,1],
            s=250,c='black',marker='*',label='centroids')
plt.legend()
plt.title('kmeans Attempting DBSCAN data')
plt.show()

DBSCAN: kmeans Fails [figure]

DBSCAN: Run DBSCAN
m7_dbscan.py (continued):

# Now do density-based analysis
# eps is max distance for two samples to be considered in the same neighborhood
# eps is considered the most important parameter for dbscan!
# min_samples is the minimum number of samples in a neighborhood to form a core
# A cluster must be around a set of "core" samples.
# metric is how to measure the distance between points.
db = DBSCAN(eps=0.2,min_samples=5,metric='euclidean')
y_db = db.fit_predict(X)    # fit and predict the cluster labels

# and now plot the separated clusters
plt.scatter(X[y_db==0,0],X[y_db==0,1], c='blue',marker='o',s=40,label='cluster 1')
plt.scatter(X[y_db==1,0],X[y_db==1,1],c='red',marker='s',s=40,label='cluster 2')
plt.legend()
plt.title('DBSCAN')
plt.show()

DBSCAN: Success! [figure]

Deep Learning
Welcome!

Today's Objectives:
Be able to define a neural network
Be able to define deep learning
Explore issues with deep learning
Understand approaches required for deep learning

Investigating Vision
Scientists investigating vision worked with a cat
They searched for an "object" to cause the neuron to fire
Instead, they found neurons fired for specific features
Example: a neuron fired for an edge
    if it was at a specific angle
    if it was at a specific location in the cat's field of view
Theory: vision is the collection of lots of neurons each detecting specific features
Neural Networks were born!

Impact on Machine Learning
Use layers of perceptrons
Problem: computationally expensive!
Problem: vanishing error gradients
So Multi-Layer Neural Networks had to wait

Recall the Perceptron
[Figure: perceptron with inputs 1, x1, x2, …, xm weighted by w0, w1, w2, …, wm, summed (Σ) to z and passed through 𝜙(z); the error is used to adjust the weights]

Build a Neural Network
Each activation unit performs a weighted sum of its inputs
Each layer can have a different number of activation units
One hidden layer is a Neural Network
Multiple hidden layers make a Deep Neural Network
[Figure: network diagram with an input layer, a hidden layer, and an output layer]

Addressing the Issues, Part 1
1. Each layer has equal variance on input and output
   This spreads out tightly coupled values
2. Use non-saturating activation functions
[Figure: plots of the Rectified Linear Unit (ReLU), Leaky ReLU, and Exponential Linear Unit]

Addressing the Issues, Part 2
3. Batch Normalization
   Zero the center and normalize inputs to each layer
4. Gradient clipping to prevent ever-increasing values
5. Reuse pretrained layers from similar problems
Significant active research!
Many choices of optimizers:
    Gradient Descent, Nesterov Accelerated Gradient, Adadelta, RMSProp, Adam, Momentum, …

Learning Rate Scheduling
Allow large changes initially
Reduce the learning rate on a schedule:
1. predetermined piecewise linear
2. performance
3. exponential
4. power

Regularization
Helps avoid overfitting by penalizing large weights
Early stopping: stop when improvement slows
Regularization: add a term to penalize large weights
𝜆 is the regularization parameter
L1 regularization: $\lambda \sum_{i=1}^{n} \left| w_i \right|$
L2 regularization: $\lambda \sum_{i=1}^{n} w_i^2$
L1 and L2 each have advantages and disadvantages!

Dropout
During training, randomly choose neuron outputs to drop between specific layers
Use a probability p to decide
Different choices made on each iteration
Helps reduce overfitting
Common value is p=0.5
Remaining weights are adjusted to reflect those dropped
Very popular method

Max-Norm Regularization
Constrain the weights so that: $\left\| w \right\|_2 \le r$
Rather than just penalizing large weights, this penalizes overall weights
Perform on the inputs to each neuron

Data Augmentation
Machine learning requires a lot of data!
Generate additional training data by modifying samples:
    Shift
    Rotate
    Reflect
    Resize
    Crop
    Change color, brightness, contrast, blemishes
This is NOT fake data!
Especially useful when processing images

Model Zoos
Publicly available deep learning models
Download them and modify them for your application!
Provide good examples and starting points

Prizes
Companies often give prizes for solving ML problems
Even the government gives prizes
    sponsored by the Intelligence Advanced Research Projects Activity (IARPA)

Image Processing
Welcome!
Today's Objectives:
Perform convolution on an image
Perform pooling on an image

Image Processing: Convolution
Map multiple pixel values onto a single pixel
Used to emphasize different features in an image
Example: 3x3 convolution
Stride length
Issues:
    Loss of data volume!
    Edge pixels aren't weighted as much
This is "Valid" padding since only valid data is used

Full Padding
Full padding starts at the outside edge and the result is larger
Can use 0 or other values external to the image

Same Padding
Same padding starts centered so the result is the same size

Convolution Example
Many convolution functions are available
Apply several functions to these photos
[Figures: zebra photo, "This photo by Unknown Author is licensed under CC BY-SA"; cat photo, "This photo by andres_cadena84 is licensed under CC PDM 1.0"]

Convolution Code: Set Up
m7_conv.py:

# "Cat eyes" by andres_cadena84 is licensed under CC PDM 1.0
# Zebra photo: This Photo by Unknown Author is licensed under CC BY-SA
import cv2                             # use to read the photo
import numpy as np                     # needed for arrays

CONV_DIM = 5                           # size of convolution
CONV_LOW = CONV_DIM//2                 # middle of array
CONV_REM = (CONV_DIM-CONV_LOW)//2      # needed for Sobel array creation
CONV_HIG = 1+CONV_DIM//2               # distance from end when to stop

FILE_CAT = '25537304817_4a55dc6092_b.jpg'
FILE_ZEB = 'zebra.jpg'
NAME_CAT = 'cat_'
NAME_ZEB = 'zeb_'
NUM_PHOTOS = 2
files = [FILE_CAT,FILE_ZEB]
names = [NAME_CAT,NAME_ZEB]

> python m7_conv.py

Convolution Code: The Function
m7_conv.py (continued):

################################################################################
# Function convolve performs convolution on an image
# inputs:
#    orig: an array holding the original image
#    dim_x,dim_y: the dimensions, in pixels, of the image
#    conv: the array to use for the convolution
#    factor: value to divide the result by, default value is 1
# outputs:
#    new_img - converted to an array
################################################################################
def convolve(orig,dim_x,dim_y,conv,factor=1):
    new_img = []                                          # list of lists
    for x_index in range(CONV_LOW,dim_x-CONV_HIG):        # for every pixel
        new_row = []                                      # list for each row
        for y_index in range(CONV_LOW,dim_y-CONV_HIG):    # mult by conv
            # use the orig argument (the slide code read the global img here)
            convolution = conv * orig[x_index-CONV_LOW:x_index+CONV_HIG,
                                      y_index-CONV_LOW:y_index+CONV_HIG]
            new_row.append(convolution.sum()/factor)      # need to compensate?
        new_img.append(new_row)                           # add row to image
    return np.array(new_img)                              # return an array

Convolution Code: Copy and Blur
m7_conv.py (continued):

conv_copy = np.zeros([CONV_DIM,CONV_DIM],int)    # this one makes a copy
conv_copy[CONV_LOW,CONV_LOW] = 1
print(conv_copy)
conv_blur = np.ones([CONV_DIM,CONV_DIM],int)     # this one blurs the image
print(conv_blur)

[[0 0 0 0 0]
 [0 0 0 0 0]
 [0 0 1 0 0]
 [0 0 0 0 0]
 [0 0 0 0 0]]
[[1 1 1 1 1]
 [1 1 1 1 1]
 [1 1 1 1 1]
 [1 1 1 1 1]
 [1 1 1 1 1]]

Convolution Code: Sobel and Laplacian
m7_conv.py (continued):

conv_soby = np.zeros([CONV_DIM,CONV_DIM],int)    # this one accentuates
conv_soby[0:CONV_REM] = -1                       # horizontal edges
conv_soby[0:CONV_REM,CONV_REM:CONV_DIM-CONV_REM] -= 1
conv_soby[-CONV_REM:] = -conv_soby[0:CONV_REM]
print(conv_soby)
conv_sobx = conv_soby.T                          # this one accentuates
print(conv_sobx)                                 # vertical edges
conv_lapl = np.zeros([CONV_DIM,CONV_DIM],int)    # this one accentuates
conv_lapl[CONV_LOW,:] = 1                        # both horizontal and
conv_lapl[:,CONV_LOW] = 1                        # vertical edges
conv_lapl[CONV_LOW,CONV_LOW] = 2 + (-CONV_DIM * 2 )
print(conv_lapl)

conv_soby:
[[-1 -2 -2 -2 -1]
 [ 0  0  0  0  0]
 [ 0  0  0  0  0]
 [ 0  0  0  0  0]
 [ 1  2  2  2  1]]
conv_sobx:
[[-1  0  0  0  1]
 [-2  0  0  0  2]
 [-2  0  0  0  2]
 [-2  0  0  0  2]
 [-1  0  0  0  1]]
conv_lapl:
[[ 0  0  1  0  0]
 [ 0  0  1  0  0]
 [ 1  1 -8  1  1]
 [ 0  0  1  0  0]
 [ 0  0  1  0  0]]

Convolution Code: Generate Images
m7_conv.py (continued):

for index in range(NUM_PHOTOS):
    img = cv2.imread(files[index],0)                 # 0 converts to grayscale
    root = names[index]
    x,y = img.shape
    print(x,y)
    cv2.imwrite(root+'out.jpg',img)                  # verify it looks right

    img_copy = convolve(img,x,y,conv_copy)           # reproduce the original
    cv2.imwrite(root+'copy.jpg',img_copy)

    img_blur = convolve(img,x,y,conv_blur,CONV_DIM*CONV_DIM)    # blur the image
    cv2.imwrite(root+'blur.jpg',img_blur)

    img_sharp = convolve(img,x,y,conv_copy)          # to sharpen features
    img_sharp = (3*img_sharp) - (2*img_blur)
    cv2.imwrite(root+'sharp.jpg',img_sharp)

    img_soby = convolve(img,x,y,conv_soby)           # look for horizontal features
    cv2.imwrite(root+'soby.jpg',img_soby)

    img_sobx = convolve(img,x,y,conv_sobx)           # look for vertical features
    cv2.imwrite(root+'sobx.jpg',img_sobx)

    img_lapl = convolve(img,x,y,conv_lapl)           # look for both
    cv2.imwrite(root+'lapl.jpg',img_lapl)

640 1024
1198 1798

Convert to Gray Scale [figures: zebra and cat photos in grayscale]
Copy [figures: copied images]
Blur [figures: original and blurred]
Sharpen [figures: original and sharpened]
Sobel Y [figures: original and Sobel Y]
Sobel X [figures: original and Sobel X]
Laplacian [figures: original and Laplacian]

Image Processing: Pooling
Pooling is a way to subsample the image
Max pooling: take the maximum value in the window
    Advantage: noise reduction
Mean pooling: take the average value in the window
Decreases size of features
Increases efficiency
Reduces overfitting
Not used as much
    Convolution with a stride of 2 is preferred

Pooling Code: Set Up
m7_pool.py

# "Cat eyes" by andres_cadena84 is licensed under CC PDM 1.0
# Zebra photo: This Photo by Unknown Author is licensed under CC BY-SA
import cv2                  # use to read the photo
import numpy as np          # needed for arrays

POOL_DIM = 2                # size of stride

FILE_CAT = '25537304817_4a55dc6092_b.jpg'
FILE_ZEB = 'zebra.jpg'
NAME_CAT = 'cat_'
NAME_ZEB = 'zeb_'
NUM_PHOTOS = 2
NUM_POOLS = 2
files = [FILE_CAT,FILE_ZEB]
names = [NAME_CAT,NAME_ZEB]
ptype = ['mean.jpg','max.jpg']

Pooling Code: Pooling Function
m7_pool.py (continued):

################################################################################
# Function pool performs pooling on an image
# inputs:
#    orig: an array holding the original image
#    dim_x,dim_y: the dimensions, in pixels, of the image
#    max: 1 do max pooling; 0 do mean pooling
# outputs:
#    new_img - converted to an array
################################################################################
def pool(orig,dim_x,dim_y,max=1):
    new_img = []                                   # list of lists
    for x_index in range(0,dim_x,POOL_DIM):        # for each pixel, striding!
        new_row = []                               # list for each row
        for y_index in range(0,dim_y,POOL_DIM):    # for each group...
            # use the orig argument (the slide code read the global img here)
            if max:
                pool_val = orig[x_index:x_index+POOL_DIM,
                                y_index:y_index+POOL_DIM].max()
            else:
                pool_val = orig[x_index:x_index+POOL_DIM,
                                y_index:y_index+POOL_DIM].mean()
            new_row.append(pool_val)               # add to list
        new_img.append(new_row)                    # add to image
    return np.array(new_img)                       # return an array

Pooling Code: Generate Images
m7_pool.py (continued):

for index in range(NUM_PHOTOS):
    img = cv2.imread(files[index],0)    # 0 converts to grayscale
    root = names[index]
    x,y = img.shape
    print(x,y)
    for pool_type in range(NUM_POOLS):
        img_pool = pool(img,x,y,pool_type)
        cv2.imwrite(root+ptype[pool_type],img_pool)

Pooling the Zebra [figures: original, mean, max]
Pooling the Cat [figures: original, mean, max]

Implementing a Convolutional Neural Network
Welcome!
Today's Objectives:
Examine Convolutional Neural Networks (CNNs)
Implement a CNN

Implementing a CNN
Find features that distinguish classes
Use convolution and pooling
Feed this into a Multi-Layer Neural Network
[Figure: CNN pipeline with stages 3x3 Convolution, 3x3 Convolution, 2x2 Max Pooling, 25% dropout, Flatten, Fully Connected, 50% dropout; sizes shown: 28x28x1, 28x28x32, 28x28x64, 14x14x64, 12,544, 128, 10]

Problem: Automate the Sorting of Mail
Post office sorts mail by zip code
Can this be automated?
Can all these be recognized as 7? [figure: handwritten samples of the digit 7]

Method: Create a Grid
Get samples of the numbers 0-9
Treat each sample as a 28x28 grid
Assign a value to each square proportional to its content

TensorFlow and Keras
TensorFlow is available, but somewhat cumbersome
Keras, built on top of TensorFlow, makes it easier to use
TensorFlow and Keras are used in the Raschka text
A solution using TensorFlow and Keras is included online: m7_keras_mnist.py

PyTorch
Another extensive ecosystem for machine learning
Used in other EEE classes
Touted as more efficient than TensorFlow

MNIST Code: Set Up
m7_pytorch_mnist.py:

import torch                              # import the various PyTorch packages
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
from torchvision import datasets          # the data repository
from torchvision import transforms        # transforming the data

################################################
# Setting up constants for training
################################################
BATCH_SIZE = 128     # number of samples per gradient update
NUM_CLASSES = 10     # how many classes to classify (10 digits, 0-9)
EPOCHS = 2           # how many epochs to run trying to improve

MNIST Code: The Class Initialization
m7_pytorch_mnist.py (continued):

################################################
# Create the network
################################################
class MNIST_NET( nn.Module ):
    ################################################
    # Initializing the network
    ################################################
    def __init__( self, num_classes):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=32, kernel_size=(3,3))
        self.conv2 = nn.Conv2d(in_channels=32, out_channels=64, kernel_size=(3,3))
        self.mpool = nn.MaxPool2d( kernel_size=2 )
        self.drop1 = nn.Dropout( p=0.25 )
        self.flat = nn.Flatten()
        self.fc1 = nn.Linear( in_features=64 * 12 * 12, out_features=128 )
        self.drop2 = nn.Dropout( p=0.5 )
        self.fc2 = nn.Linear( in_features=128, out_features=num_classes )

MNIST Code: The Class Function
m7_pytorch_mnist.py (continued):

    ################################################
    # Forward pass of the network
    ################################################
    def forward( self, x ):
        x = F.relu( self.conv1( x ) )
        x = F.relu( self.conv2( x ) )
        x = self.mpool( x )
        x = self.drop1( x )
        x = self.flat( x )
        x = F.relu( self.fc1( x ) )
        x = self.drop2( x )
        x = self.fc2( x )
        return x

MNIST Code: Load and Transform the Data
m7_pytorch_mnist.py (continued):

###################################################
# steps required to transform the data:
# 1. Convert to a tensor
# 2. Normalize with mean 0.5 and standard deviation 0.5
# 3. Bundle the steps together
###################################################
to_tensor = transforms.ToTensor()
normalize = transforms.Normalize([0.5],[0.5])
transform = transforms.Compose( [ to_tensor, normalize ] )

###################################################
# load the training data and transform it
# then get it into the pytorch environment
# then do the same for the test data
###################################################
trainset = datasets.MNIST('~/MNIST_data/train', download=True,
                          train=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=BATCH_SIZE,
                                          shuffle=True)
testset = datasets.MNIST('~/MNIST_data/test', download=True,
                         train=False, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=BATCH_SIZE,
                                         shuffle=True)

MNIST Code: Set Up the Object
m7_pytorch_mnist.py (continued):

################################################
# Get ready to train
################################################
mnist_net = MNIST_NET( NUM_CLASSES )                    # initialize the object
criterion = nn.CrossEntropyLoss()                       # select cost function
optimizer = optim.Adadelta( mnist_net.parameters() )    # select the optimizer

MNIST Code: Train!
m7_pytorch_mnist.py (continued):

################################################
# Training loop
################################################
num_batches = len(trainloader)
for epoch in range( EPOCHS ):
    for batch_idx, (images, labels) in enumerate(trainloader):
        optimizer.zero_grad()                  # resets the gradients
        output = mnist_net( images )           # calls the forward function
        loss = criterion( output, labels )     # calculates the errors
        loss.backward()                        # back propagates the changes
        optimizer.step()                       # updates the weights

        if batch_idx % 100 == 0:               # report periodically
            batch_loss = loss.mean().item()
            print("Epoch {}/{}\tBatch {}/{}\tLoss: {}"
                  .format(epoch, EPOCHS, batch_idx, num_batches, batch_loss))

MNIST Code: Test!
m7_pytorch_mnist.py (continued):

################################################
# Testing loop
################################################
num_correct = 0     # initialize the counters
num_attempts = 0
for images, labels in testloader:            # for each batch of images
    with torch.no_grad():
        outputs = mnist_net( images )        # run the images through
    guesses = torch.argmax( outputs, 1)      # most probable guess per image
    num_guess = len( guesses )               # how many in this batch
    num_right = torch.sum( labels == guesses ).item()    # how many correct
    num_correct += num_right                 # track accuracy
    num_attempts += num_guess
print("Total test accuracy:", 100*num_correct/num_attempts,"%")

MNIST Code: Run!
> python m7_pytorch_mnist.py
Epoch 0/2 Batch 0/469 Loss: 2.3052358627319336
Epoch 0/2 Batch 100/469 Loss: 0.2719106376171112
Epoch 0/2 Batch 200/469 Loss: 0.09463059902191162
Epoch 0/2 Batch 300/469 Loss: 0.04825776815414429
Epoch 0/2 Batch 400/469 Loss: 0.04239816218614578
Epoch 1/2 Batch 0/469 Loss: 0.1446414291858673
Epoch 1/2 Batch 100/469 Loss: 0.061279844492673874
Epoch 1/2 Batch 200/469 Loss: 0.09219815582036972
Epoch 1/2 Batch 300/469 Loss: 0.04766023904085159
Epoch 1/2 Batch 400/469 Loss: 0.03350739926099777
Total test accuracy: 97.85 %

Generative Adversarial Networks
Welcome!

Today's Objective:
Introduction to Generative Adversarial Networks

Generative Adversarial Network
Goal: ability to generate images that are indistinguishable from real images
Method: Use two machine learning models
Model 1: Generator
    Based on given parameters, create images
Model 2: Discriminator
    Decides if input is real or artificial
The two models are adversaries!

Use Case: Generate Faces
Animators can draw perfect faces
But people spot them as fakes immediately!
Audiences find this very distracting.
By making the characters look at least slightly fake, audiences are not distracted and enjoy the show.
Example parameters for generating faces:
    hair color
    eye color
    ratio of width to height
    distance between eyes
    skin color

Encoder-Decoder Pair
An encoder takes a sample and maps it to a vector.
The decoder takes that vector and recreates the sample.
Ideally, the intermediate vector has fewer dimensions.
Having trained the encoder, analyze the vectors to understand how the values are distributed.
Use those distributions to seed the generator.

Generator
Tries to create images that will pass as real
Input is a set of parameters with ranges and distributions
From those, randomly select values
Training "teaches" the generator to pick proper values
The generator is not the same as the encoder, but their functionality is related
Initial attempts are usually not recognizable as images

The Discriminator
Trained with real and artificial data
The discriminator is initially poor at its job
Over time, the generator and discriminator improve

Very Difficult to Train
GANs were first suggested in 2014
Getting good results is still difficult
Mode Collapse: The generator focuses on a single representation
Hot topic of current research

Example Face Generation
From projects done in AME598 Minds and Machines and EEE598 Computational Image Understanding & Pattern Analysis

Deep Fakes
Not just the ability to create fake photographs
Current technology allows real-time video manipulation
Example: politician gives a speech
    Bad actors are filmed giving a different speech
    Machine learning maps the actor's mouth onto the politician's face
    Machine learning modifies the voice to sound like the politician's
But, machine learning is also being used to spot deep fakes…