Week 14 - COE305 Machine Learning PDF

Summary

These notes cover Week 14 of a machine learning course (COE305). The content focuses on building neural networks, optimizers and loss functions, and convolutional neural networks (CNNs), including CNN layers, the LeNet-5 architecture, common CNN variants, and applications.

Full Transcript

WEEK 14: BUILDING NEURAL NETWORKS

Multilayer Perceptron

[Slides: a multilayer perceptron, and the same network annotated with its weights.]

The goal of backpropagation is to adjust the weights so that the neural network produces the optimized output.

Optimizers

An optimizer in machine learning is an algorithm or method used to adjust the weights (or parameters) of a neural network during training in order to minimize the loss function. The goal of the optimizer is to improve the performance of the model by finding the set of weights that results in the smallest possible error between the model's predictions and the actual data.

Types of optimizers:

- Gradient Descent
- Stochastic Gradient Descent (SGD)
- Mini-batch SGD
- SGD with Momentum
- Adagrad
- Adadelta & RMSProp
- Adam

Choosing an optimizer:

- Simple models: SGD or SGD with momentum may suffice.
- Deep networks: Adam is often a good choice due to its adaptability.
- Specific tasks: try different optimizers and tune hyperparameters for the best results.
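To make the update rules concrete, here is a minimal NumPy sketch (not from the course material) comparing plain gradient descent with SGD with momentum on a toy quadratic loss; the target vector, learning rate, and momentum coefficient are illustrative assumptions.

```python
import numpy as np

# Toy quadratic loss L(w) = ||w - target||^2, so grad L(w) = 2 * (w - target).
# The target and all hyperparameters below are made up for the example.
target = np.array([3.0, -1.0])

def grad(w):
    return 2.0 * (w - target)

w = np.zeros(2)         # weights trained with plain gradient descent
w_m = np.zeros(2)       # weights trained with SGD + momentum
velocity = np.zeros(2)  # momentum buffer
lr, beta = 0.1, 0.9     # learning rate and momentum coefficient

for step in range(100):
    # Plain gradient descent: w <- w - lr * grad(w)
    w -= lr * grad(w)

    # Momentum: v <- beta * v + grad(w); w <- w - lr * v
    velocity = beta * velocity + grad(w_m)
    w_m -= lr * velocity

print(w, w_m)  # both approach the minimizer [3, -1]
```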
Loss Function

A loss function is a mathematical function that measures the difference between the predicted output of a machine learning model and the actual target values (ground truth). It acts as a guide for the optimization process by indicating how well the model is performing: the optimizer uses the gradients of the loss function to adjust the weights in the direction that minimizes the loss. Examples:

- Mean Squared Error (MSE)
- Binary Cross-Entropy
- Categorical Cross-Entropy
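As a small illustration of the first two losses (the labels and predictions are toy values, not course data):

```python
import numpy as np

y_true = np.array([1.0, 0.0, 1.0, 1.0])  # ground-truth labels (toy example)
y_pred = np.array([0.9, 0.2, 0.7, 0.6])  # predicted probabilities in [0, 1]

# Mean Squared Error: average squared difference between prediction and truth.
mse = np.mean((y_true - y_pred) ** 2)

# Binary Cross-Entropy: -mean(y*log(p) + (1-y)*log(1-p)); eps avoids log(0).
eps = 1e-12
bce = -np.mean(y_true * np.log(y_pred + eps)
               + (1.0 - y_true) * np.log(1.0 - y_pred + eps))

print(f"MSE: {mse:.4f}, BCE: {bce:.4f}")
```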
Introduction to CNN

Yann LeCun, director of Facebook's AI Research Group, is the pioneer of convolutional neural networks. He built the first convolutional neural network, called LeNet, in 1988; LeNet was used for character recognition. A convolutional neural network (CNN) is a type of artificial neural network used in image recognition and processing that is specifically designed to process pixel data.

Applications of CNN

[Slide of example application domains; CNNs are widely used in image recognition and processing tasks.]

Layers in a Convolutional Neural Network

1. Convolution layer
2. Activation layer
3. Pooling layer
4. Fully connected layer

Convolution Layer

In a CNN, every image is represented as an array of pixel values. A black-and-white (grayscale) image is a single 2D array; a color image is represented as three such arrays stacked together, one per RGB channel.

Example: the image matrix is convolved with the filter matrix to produce the output matrix; each output entry is the element-wise product of the filter with one window of the image, summed. For the top-left window:

1×1 + 1×0 + 1×1 + 0×0 + 1×1 + 1×0 + 0×1 + 0×0 + 1×1 = 4

Convolution of an image with different filters can perform operations such as edge detection, blurring, and sharpening.

Parameters:

- Kernel size: dimensions of the filter.
- Stride: the step size when sliding the filter over the input.
- Padding: adding extra borders around the input to control the spatial size of the output.

Activation Layer

Introduces non-linearity to help the network learn complex patterns. Common activation functions:

- ReLU (Rectified Linear Unit): f(x) = max(0, x)
- Sigmoid: compresses values between 0 and 1.
- Tanh: compresses values between -1 and 1.

Pooling Layer

Convolutional layers in a convolutional neural network summarize the presence of features in an input image. A problem with the output feature maps is that they are sensitive to the location of the features in the input. One solution is down-sampling, and pooling layers provide an approach to down-sampling feature maps. Common types are max pooling and average pooling; pooling is applied to each channel independently and the results are stacked.
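The three operations above fit in a few lines of NumPy. The sketch below reproduces the slide's worked example (a 5×5 binary image convolved with a 3×3 filter, where the top-left window gives 4), then applies ReLU and max pooling; the pooling window and stride of 1 are assumptions chosen so the 3×3 feature map still yields an output.

```python
import numpy as np

# 5x5 binary image and 3x3 filter matching the slide's worked example.
image = np.array([[1, 1, 1, 0, 0],
                  [0, 1, 1, 1, 0],
                  [0, 0, 1, 1, 1],
                  [0, 0, 1, 1, 0],
                  [0, 1, 1, 0, 0]])
kernel = np.array([[1, 0, 1],
                   [0, 1, 0],
                   [1, 0, 1]])

def conv2d(img, k, stride=1):
    """Valid convolution: slide the filter, multiply elementwise, sum."""
    kh, kw = k.shape
    oh = (img.shape[0] - kh) // stride + 1
    ow = (img.shape[1] - kw) // stride + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = img[i*stride:i*stride+kh, j*stride:j*stride+kw]
            out[i, j] = np.sum(patch * k)
    return out

def max_pool(x, size=2, stride=1):
    """Max pooling: take the maximum over each window."""
    oh = (x.shape[0] - size) // stride + 1
    ow = (x.shape[1] - size) // stride + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = x[i*stride:i*stride+size, j*stride:j*stride+size].max()
    return out

feature_map = conv2d(image, kernel)       # [[4 3 4] [2 4 3] [2 3 4]]; entry (0,0) is 4
activated = np.maximum(feature_map, 0)    # ReLU: f(x) = max(0, x)
pooled = max_pool(activated)              # [[4 4] [4 4]]
print(feature_map, activated, pooled, sep="\n")
```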
Fully Connected Layer

The fully connected layer flattens the pooled feature maps and connects every input to every output neuron to produce the final prediction.

Case Study: LeNet-5 Architecture

Yann LeCun and team proposed a neural network architecture for handwritten and machine-printed character recognition in the 1990s. The LeNet-5 architecture consists of two sets of convolutional and average pooling layers, followed by a flattening convolutional layer, then two fully connected layers, and finally a softmax classifier.

Reference: http://yann.lecun.com/exdb/publis/pdf/lecun-01a.pdf

[Slide: summary table of the LeNet-5 layers.]
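Below is a minimal Keras sketch of the layout just described. The filter counts (6 and 16), the 120/84 fully connected sizes, and tanh activations follow the original paper, while the "flattening convolutional layer" (C5) is simplified here to Flatten plus a Dense layer; treat this as an illustration, not the course's reference implementation.

```python
from tensorflow import keras
from tensorflow.keras import layers

# LeNet-5: conv -> avg pool -> conv -> avg pool -> flatten -> FC -> FC -> softmax.
model = keras.Sequential([
    layers.Input(shape=(32, 32, 1)),                      # 32x32 grayscale input
    layers.Conv2D(6, kernel_size=5, activation="tanh"),   # C1: 6 feature maps, 28x28
    layers.AveragePooling2D(pool_size=2),                 # S2: 14x14
    layers.Conv2D(16, kernel_size=5, activation="tanh"),  # C3: 16 feature maps, 10x10
    layers.AveragePooling2D(pool_size=2),                 # S4: 5x5
    layers.Flatten(),                                     # stand-in for the C5 flattening conv
    layers.Dense(120, activation="tanh"),                 # C5
    layers.Dense(84, activation="tanh"),                  # F6
    layers.Dense(10, activation="softmax"),               # 10 character classes
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```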
Variants of CNN

Classic variants:

- LeNet: introduced by Yann LeCun. Designed for handwritten digit recognition (e.g., the MNIST dataset). Simple architecture with convolutional layers followed by fully connected layers.
- AlexNet: proposed by Alex Krizhevsky. Key breakthrough in image recognition (ImageNet competition, 2012). Uses ReLU activations, dropout, and overlapping max pooling.
- VGGNet: known for using very deep networks with small 3×3 convolutional filters. Variants include VGG16 and VGG19 (16 and 19 layers, respectively).

Feature extraction-focused variants:

- GoogLeNet (Inception Network): utilizes "Inception modules" that perform convolutions with multiple filter sizes in parallel. Reduces computational cost with 1×1 convolutions. Variants: Inception-v1 to Inception-v4.
- ResNet (Residual Networks): introduces "skip connections" or "residual blocks" to avoid vanishing gradients in deep networks. Variants: ResNet-50, ResNet-101, etc. The basis for many modern architectures.
- DenseNet: features dense connections between layers to reuse features and improve gradient flow. Variants: DenseNet-121, DenseNet-169, etc.

Lightweight and efficient variants:

- MobileNet: designed for mobile and embedded devices. Uses depthwise separable convolutions for efficiency. Variants: MobileNetV1, MobileNetV2, MobileNetV3.
- EfficientNet: balances model depth, width, and resolution to optimize performance. Uses a compound scaling approach (EfficientNet-B0 to EfficientNet-B7).

Region-based and object detection variants:

- R-CNN (Region-CNN): extracts region proposals for object detection. Variants: Fast R-CNN, Faster R-CNN, Mask R-CNN (adds instance segmentation).
- YOLO (You Only Look Once): real-time object detection. Variants: YOLOv3, YOLOv4, YOLOv5, etc.
- SSD (Single Shot Detector): performs object detection and classification in a single forward pass.

Task-specific variants:

- U-Net: designed for medical image segmentation. Uses an encoder-decoder structure with skip connections.
- PSPNet (Pyramid Scene Parsing Network): handles semantic segmentation with global context information.
- FCN (Fully Convolutional Networks): fully convolutional layers for pixel-wise predictions (e.g., segmentation tasks), associating a label or category with every pixel in an image.

Transfer Learning

Transfer learning is a machine learning technique in which knowledge gained through one task or dataset is used to improve model performance on another related task and/or a different dataset. In other words, transfer learning uses what has been learned in one setting to improve generalization in another setting.
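A minimal Keras sketch of the idea, assuming a MobileNetV2 backbone pretrained on ImageNet (MobileNet appears among the variants above) and a hypothetical 5-class target task: the pretrained base is frozen and only a new classification head is trained.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Reuse a network pretrained on ImageNet as a frozen feature extractor.
base = keras.applications.MobileNetV2(input_shape=(224, 224, 3),
                                      include_top=False,  # drop the ImageNet classifier
                                      weights="imagenet")
base.trainable = False  # freeze: keep the knowledge learned on the source task

# New head for the target task (5 classes is an assumption for this example).
model = keras.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(5, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(target_dataset, epochs=5)  # train only the new head on the target data
```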
Advantages of CNNs

1. Good at detecting patterns and features in images, videos, and audio signals.
2. Robust to translation, rotation, and scaling (invariance).
3. End-to-end training; no need for manual feature extraction.
4. Can handle large amounts of data and achieve high accuracy.

Disadvantages of CNNs

1. Computationally expensive to train and require a lot of memory.
2. Can be prone to overfitting if there is not enough data or proper regularization is not used.
3. Require large amounts of labeled data.
4. Limited interpretability: it is hard to understand what the network has learned.