Chapter 3: Introduction to AI, Machine Learning, Deep Learning, and Large Language Models (LLMs)
Comprehensive Overview for Understanding Modern AI Technologies
Presented by: Dr. Labed Abdeldjalil

What is Artificial Intelligence?
AI refers to the simulation of human intelligence in machines designed to think and act like humans.
▪ Key Characteristics: Learning from data; reasoning and problem-solving; perception and language understanding
▪ Key Areas of AI: Machine Learning (ML), Computer Vision, Robotics, Natural Language Processing (NLP), Expert Systems

History of AI
[Image slide.]

Types of AI
▪ Narrow AI: Designed for specific tasks
▪ General AI: Can perform any intellectual task a human can
▪ Strong AI vs. Weak AI:
  Strong AI: True understanding and consciousness
  Weak AI: Simulates intelligence without true understanding

Applications of AI
▪ Healthcare: Diagnostic tools, robotic surgery
▪ Finance: Fraud detection, algorithmic trading
▪ Manufacturing: Predictive maintenance, automation
▪ Transportation: Autonomous vehicles
▪ Everyday Examples: Virtual assistants, recommendation engines
▪ ...and many more

What is Machine Learning?
▪ A subset of AI focused on enabling machines to learn from data
▪ Difference between AI and Machine Learning:
  AI is the broader concept of machines performing tasks intelligently.
  Machine Learning is a specific approach within AI that uses data-driven learning.

Machine Learning
[Image slide.]

Machine Learning (Why Is It Hard?)
[Image slide.]

Categories of Machine Learning
▪ Supervised Learning: Learning from labeled data
▪ Unsupervised Learning: Finding hidden patterns in unlabeled data
▪ Semi-Supervised Learning: Combining a small amount of labeled data with a large amount of unlabeled data
▪ Reinforcement Learning: Learning through actions and rewards

Supervised Learning: Overview
▪ A type of machine learning where the model is trained on labeled data.
▪ Labeled Datasets: Consist of input-output pairs where the output (label) is known.
▪ Process:
  Training phase: Model learns from the data.
  Prediction phase: Model makes predictions on new, unseen data.

Supervised Learning Algorithms (Optional)
▪ Linear Regression: Predicts a continuous output based on input features.
▪ Decision Trees: Split data into branches to make decisions or predictions.
▪ Support Vector Machines (SVM): Find the optimal boundary that separates different classes.
▪ k-Nearest Neighbors (k-NN): Classifies data points based on the majority label of their nearest neighbors (see the sketch below).
▪ Neural Networks: Mimic the brain to recognize patterns and learn from data.

Use Cases of Supervised Learning
▪ Image Classification: Identifying objects within images.
▪ Spam Detection: Classifying emails as spam or legitimate.
▪ Medical Diagnosis: Predicting diseases based on patient data.
▪ Financial Forecasting: Predicting stock prices or market trends.

Unsupervised Learning: Overview
▪ A type of machine learning that deals with unlabeled data.
▪ Unlabeled Datasets: Data without explicit output labels.
▪ Objectives: Discover underlying structures, patterns, or distributions in data.

Unsupervised Learning Algorithms (Optional)
▪ K-Means: Partitions data into k distinct clusters (see the sketch below).
▪ DBSCAN: Identifies clusters based on density.
▪ Principal Component Analysis (PCA): Reduces the number of features while preserving variance.
▪ t-SNE: Reduces dimensions for visualizing high-dimensional data.
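To make the supervised workflow above concrete, here is a minimal sketch (not from the original slides) of k-Nearest Neighbors: it trains on labeled input-output pairs and then predicts on unseen data, exactly the two phases described earlier. The use of scikit-learn and its bundled Iris dataset is an illustrative choice, as are the hyperparameters.

```python
# Minimal supervised-learning sketch: k-Nearest Neighbors on a toy dataset.
# scikit-learn and its Iris dataset are assumed dependencies, chosen for illustration.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)  # labeled data: input features X, known labels y
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = KNeighborsClassifier(n_neighbors=5)  # majority vote of the 5 nearest neighbors
model.fit(X_train, y_train)                  # training phase: learn from labeled data
print("accuracy on unseen data:", model.score(X_test, y_test))  # prediction phase
```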
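For the unsupervised side, a similar sketch of k-means: no labels are given, and the algorithm discovers the cluster structure on its own. The synthetic two-blob data and the choice of scikit-learn are again illustrative assumptions.

```python
# Minimal unsupervised-learning sketch: k-means clustering on unlabeled points.
# The two synthetic "blobs" are illustrative; no labels are ever provided.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
points = np.vstack([
    rng.normal(0, 0.5, (50, 2)),   # one blob of points around (0, 0)
    rng.normal(5, 0.5, (50, 2)),   # another blob around (5, 5)
])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(points)
print(kmeans.cluster_centers_)  # recovered centers, near (0, 0) and (5, 5)
```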
Use Cases of Unsupervised Learning
▪ Market Segmentation: Grouping customers based on purchasing behavior.
▪ Anomaly Detection: Identifying unusual patterns that may indicate fraud or system failures.
▪ Recommendation Systems: Suggesting products or content based on user behavior patterns.
▪ Image Compression: Reducing the size of images without significant loss of quality.
▪ Genomics: Discovering gene expression patterns.

Semi-Supervised Learning: Overview
▪ A type of machine learning that combines a small amount of labeled data with a large amount of unlabeled data.
▪ Rationale: Labeled data is often expensive or time-consuming to obtain.
▪ Benefits: More cost-effective than purely supervised learning, with improved accuracy.

Applications of Semi-Supervised Learning
▪ When labeled data is scarce
▪ Text categorization
▪ Image recognition
▪ Speech recognition
▪ Bioinformatics: Protein classification with limited labeled samples.

Reinforcement Learning: Overview
▪ A type of machine learning where an agent learns to make decisions by performing actions and receiving feedback in the form of rewards or penalties.
▪ Core Concepts: Agent, Environment, Actions, Rewards, Policy.
▪ Learning Process: Trial and error to maximize cumulative rewards.

Reinforcement Learning Algorithms (Optional)
▪ Q-Learning: A model-free algorithm that learns the value of actions in states (a minimal sketch follows the LLM introduction below).
▪ Deep Q-Networks (DQN): Combine Q-Learning with deep neural networks for high-dimensional state spaces.
▪ Policy Gradient Methods: Directly optimize the policy.
▪ Actor-Critic Methods: Combine value-based and policy-based approaches.

Applications of Reinforcement Learning
▪ Robotics: Teaching robots to perform complex tasks like grasping or walking.
▪ Game AI: AlphaGo, OpenAI Five for Dota 2.
▪ Autonomous Systems: Self-driving cars.
▪ Finance: Algorithmic trading strategies.
▪ Healthcare: Personalized treatment strategies.

What is Deep Learning?
▪ A subset of machine learning that uses multi-layered neural networks to model complex patterns in data.
▪ Relation to Machine Learning: Deep learning is a specialized approach within machine learning.
▪ Key Features: Automatic feature extraction, scalability with large datasets (a tiny forward-pass sketch follows the LLM introduction below).

Types of Neural Networks (Optional)
▪ Convolutional Neural Networks (CNNs): Designed for image recognition.
▪ Recurrent Neural Networks (RNNs): Designed for sequential data, such as language modeling.
▪ Generative Adversarial Networks (GANs): Comprise a generator and a discriminator.
▪ Transformers: Use self-attention mechanisms for NLP tasks.

Applications of Deep Learning
▪ Image Recognition: Facebook's facial recognition.
▪ Natural Language Processing: Google Translate.
▪ Speech Recognition: Amazon Alexa, Apple Siri.
▪ Autonomous Vehicles: Tesla's Autopilot.
▪ Healthcare: AI-driven diagnostic tools.
▪ In short: everywhere.

Introduction to Large Language Models (LLMs)
▪ Definition: LLMs are AI models capable of understanding and generating human-like text by leveraging massive amounts of data.
▪ Importance: LLMs form the backbone of many AI-driven applications, from chatbots to content creation.
▪ Transformers: LLMs are built on a neural network architecture called the Transformer.
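Before turning to LLM specifics, here is a minimal sketch of the Q-learning update described in the reinforcement-learning slides. The environment (a hypothetical 5-cell corridor where the agent earns a reward for reaching the last cell) and all hyperparameters are illustrative inventions, not from the slides.

```python
# Minimal tabular Q-learning sketch: an agent in a hypothetical 5-cell corridor
# learns, by trial and error, that moving right reaches the rewarded goal cell.
import random

N_STATES, ACTIONS = 5, [-1, +1]          # states 0..4; actions: move left or right
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma, epsilon = 0.5, 0.9, 0.1    # learning rate, discount, exploration rate

for episode in range(500):
    s = 0
    while s != N_STATES - 1:             # episode ends at the goal cell
        # epsilon-greedy: mostly exploit the best-known action, sometimes explore
        a = random.choice(ACTIONS) if random.random() < epsilon \
            else max(ACTIONS, key=lambda act: Q[(s, act)])
        s_next = min(max(s + a, 0), N_STATES - 1)
        r = 1.0 if s_next == N_STATES - 1 else 0.0   # reward only at the goal
        # Q-learning update: nudge Q(s, a) toward reward + discounted best next value
        best_next = max(Q[(s_next, b)] for b in ACTIONS)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = s_next

# learned greedy policy per state: +1 (move right) once training has converged
print([max(ACTIONS, key=lambda act: Q[(s, act)]) for s in range(N_STATES)])
```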
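And to ground the deep-learning slides, a tiny forward pass through a two-layer neural network in plain NumPy. The layer sizes and random weights are placeholders; training (backpropagation) is deliberately omitted to keep the stacked-layer idea visible.

```python
# Minimal deep-learning sketch: a forward pass through a two-layer network.
# Shapes and weights are illustrative placeholders; no training is performed.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(1, 4))                      # one input example, 4 features

W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)    # layer 1: 4 -> 8
W2, b2 = rng.normal(size=(8, 3)), np.zeros(3)    # layer 2: 8 -> 3 (e.g., 3 classes)

h = np.maximum(0, x @ W1 + b1)                   # hidden layer with ReLU: learned features
logits = h @ W2 + b2                             # output layer
probs = np.exp(logits) / np.exp(logits).sum()    # softmax over the 3 classes
print(probs)
```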
Popular Large Language Models
▪ GPT Series: Developed by OpenAI; used for text generation, translation, and summarization.
▪ Gemini: Developed by Google; a multimodal model family spanning text, images, and code.
▪ Llama 3: Developed by Meta; an open-source model you can fine-tune, distill, and deploy anywhere.
▪ Other Notable Models: XLNet, RoBERTa, ERNIE.

Transformer Architecture Overview
▪ Overview of the Transformer Model: Introduced in 2017 by Vaswani et al.
▪ Self-Attention Mechanism: The model can focus on different parts of the input text at once.
▪ Positional Encoding: The model differentiates word order with positional encoding.

Self-Attention Mechanism
▪ Self-attention lets the model weigh the importance of each word in a sentence. (It's like giving attention to the words that matter the most.)
▪ Example: In the sentence "The cat sat on the mat, and it was happy," self-attention helps the model understand that "it" refers to "the cat." Likewise, the word "sat" might give more attention to "cat" than to "mat," because "cat" is more directly related to the action of sitting.
▪ Importance: It enables the model to handle complex language tasks like translation and summarization (a worked sketch appears at the end of the chapter).

Positional Encoding
▪ Purpose: Provides information about the relative position of tokens in a sentence. (Tokens can be individual words, sub-words, or even special characters such as punctuation.)
▪ How It Works: A vector is added to each token's embedding to encode its position.
▪ Example: Positional encoding helps the model differentiate between words in context.

LLM Training Phases
▪ Pre-training and Fine-tuning: A two-phase process.
▪ Pre-training: The model learns general language patterns from vast corpora.
▪ Fine-tuning: The model is refined for specific tasks with specialized datasets.

Pre-training Process
▪ Massive Dataset Exposure: The model is trained on vast amounts of text data.
▪ Tokenization: The LLM breaks sentences into tokens (words, sub-words, etc.).
▪ Embedding: Each token is transformed into a numerical vector.

Fine-tuning Process
▪ Specialized Training: Fine-tuning narrows the LLM's focus onto a specific area.
▪ Labeled Data: Fine-tuning uses labeled input-output pairs.
▪ Output Refinement: The model practices generating outputs for specific tasks.

Importance of the Context Window
▪ The context window is the amount of text the model can process at one time.
▪ Coherence and Relevance: Larger context windows maintain coherence over longer interactions. However, there are trade-offs: larger context windows require more computational power and memory to process.
▪ Comparison: Claude 2 has a context window of 100,000 tokens; GPT-4 Turbo extends to 128,000 tokens (a quick words-per-token calculation appears at the end of the chapter).
▪ In English, roughly 750 words correspond to 1,000 tokens on average. Researchers predicted models with one-million-plus-token context windows by 2024.

Ethics in AI
▪ Bias in AI Models: Training data, algorithm design, societal biases.
▪ Privacy Concerns: Data collection and usage.
▪ AI Fairness: Ensuring equitable treatment across demographic groups.
▪ Transparency and Explainability: Making AI decisions understandable to humans.
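To close the chapter, here is the promised sketch of scaled dot-product self-attention with sinusoidal positional encoding, following the transformer slides above. The 4-token "sentence," the 8-dimensional embeddings, and the random weight matrices are all illustrative stand-ins, not a real model's parameters.

```python
# Minimal sketch of self-attention plus sinusoidal positional encoding in NumPy.
# All sizes and weights are illustrative; a real transformer is far larger.
import numpy as np

rng = np.random.default_rng(0)
seq_len, d = 4, 8                        # 4 tokens, 8-dimensional embeddings
tokens = rng.normal(size=(seq_len, d))   # stand-in token embeddings

# Positional encoding: a position-dependent vector is added to each embedding,
# so the model can tell word order apart (the sinusoidal scheme of Vaswani et al.).
pos = np.arange(seq_len)[:, None]
i = np.arange(d // 2)[None, :]
angles = pos / (10000 ** (2 * i / d))
pe = np.zeros((seq_len, d))
pe[:, 0::2], pe[:, 1::2] = np.sin(angles), np.cos(angles)
x = tokens + pe

# Self-attention: each token forms a query, key, and value vector, then mixes
# the values of all tokens, weighted by how relevant each one is to it.
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
Q, K, V = x @ Wq, x @ Wk, x @ Wv
scores = Q @ K.T / np.sqrt(d)            # relevance of every token to every other
weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)  # row softmax
output = weights @ V                     # attention-weighted mix of value vectors
print(weights.round(2))                  # each row sums to 1: one token's attention
```

In a sentence like the slide's example, a trained model would produce a large weight in the row for "it" at the column for "cat"; here the weights are random, but the mechanics are the same.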
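Finally, the context-window arithmetic promised above, using the slides' own rule of thumb of about 750 English words per 1,000 tokens:

```python
# Back-of-the-envelope context-window arithmetic, using the chapter's rule of
# thumb (~750 English words per 1,000 tokens) and the figures quoted above.
WORDS_PER_TOKEN = 750 / 1000

for model, tokens in [("Claude 2", 100_000), ("GPT-4 Turbo", 128_000)]:
    print(f"{model}: {tokens:,} tokens ≈ {int(tokens * WORDS_PER_TOKEN):,} words")
# Claude 2: 100,000 tokens ≈ 75,000 words
# GPT-4 Turbo: 128,000 tokens ≈ 96,000 words
```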