Introduction into Current Developments in Generative AI 2024
2024
Lucas Schönhold
Summary
This document presents an introduction to current developments in Generative AI, including a time schedule and course details. It discusses topics such as generative models, discriminative modeling, and generative adversarial networks (GANs).
Full Transcript
Introduction into current developments in Generative AI
Lucas Schönhold
Trends of Artificial Intelligence in Business Informatics, WS 2024/25

Time schedule

Phase 2 of the Course
- Please enter your project preferences into the Etherpad by Friday, November 15th!
- Make sure you register for the exam (deadline: December 13th)! (HISinOne)
- If you don't have a group, we will put one together for you.
- We will begin sending e-mails about the group assignments in the block week (CW 46).

Schedule
- Generative AI
- Generative Modeling
- Generative Adversarial Networks
- Deepfakes
- Generative Image Manipulation
- Transformer Architecture
- ChatGPT
- Limitations of LLMs
- Retrieval Augmented Generation

Generative AI
"Generative modeling is a branch of machine learning that involves training a model to produce new data that is similar to a given dataset." (D. Foster)

Generative Models
- Probabilistic: they have a random component.
- ⇒ The model should mimic an unknown probability distribution and sample from it, so that the samples look as if they had come from the original data.

Discriminative Modeling
- Separates observations into different categories (by their labels) → decision boundary
- After training, the model can discriminate (categorize) new data based on an observation.
- The model can't reproduce the data; it can only categorize.
- A discriminative model estimates the probability that an observation 𝒙 belongs to a category 𝑦: 𝑝(𝑦|𝒙)

Generative Modeling
- Produces new data that does not have to be categorized (can be trained without labels)
- Estimates how likely a sample 𝑥 is under a probability distribution 𝑝(𝑥) ⇒ sampling from it generates new samples
- Can also be conditional (with labels): 𝑝(𝒙|𝑦) (observation 𝒙 being associated with a label 𝑦)

The field of AI
Most of AI is discriminative, for two reasons:
1. By default, we try to categorize data; an AI that speeds up that process is more immediately helpful.
2. Building generative AI is much more complicated because it requires a deeper understanding of the data.
For example, rating a solution is easier than coming up with one.

The rise of Generative AI
- It was long believed that the problem of generative AI could not be solved, as it would require AI to be creative, which was thought to be a feature only humans have.
- However, if we want to build a general-purpose AI, it should also include that feature.
- Over the past 10 years, there has been tremendous progress in the field of generative AI.

Generative Modeling
- The goal of a generative model is to mimic a probability distribution 𝑝_data.
- The distribution of the model is 𝑝_model.
- Sampling from 𝑝_model should produce data that looks as if it had come from 𝑝_data.
- Let's see how we would do…

[Figure: dots created by 𝑝_data]
- Where would you put a new dot to make it seem like 𝑝_data has generated it?
- How would 𝑝_model look?
- To generate new samples, you would choose a point in 𝑝_model.
- What did we just do?
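The dot exercise can be replayed in a few lines of Python. This is only a minimal sketch under stated assumptions: the hidden 𝑝_data is taken to be a two-dimensional Gaussian (the slides do not say what produced the dots), 𝑝_model is an independent Gaussian per axis, and the names `fit_model` and `sample_model` are made up for illustration.

```python
import random
import statistics

def fit_model(dots):
    """Fit a very simple p_model: an independent Gaussian per axis."""
    xs = [x for x, _ in dots]
    ys = [y for _, y in dots]
    return {
        "mean": (statistics.mean(xs), statistics.mean(ys)),
        "std": (statistics.stdev(xs), statistics.stdev(ys)),
    }

def sample_model(model, n, rng):
    """Draw n new dots from p_model."""
    (mean_x, mean_y), (std_x, std_y) = model["mean"], model["std"]
    return [(rng.gauss(mean_x, std_x), rng.gauss(mean_y, std_y)) for _ in range(n)]

rng = random.Random(0)  # fixed seed so the run is repeatable

# Assumption: p_data is a hidden Gaussian that we pretend not to know.
observed_dots = [(rng.gauss(2.0, 0.5), rng.gauss(-1.0, 1.5)) for _ in range(500)]

p_model = fit_model(observed_dots)         # estimate p_model from the dots
new_dots = sample_model(p_model, 5, rng)   # "put new dots" by sampling
print("estimated mean:", p_model["mean"])
print("new dots:", new_dots)
```

The two steps mirror what was done by eye: first guess the shape of the region the dots came from, then pick new points inside that shape.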

Generative Modeling Framework
- We assume that an unknown distribution 𝑝_data generated all observations.
- We want to build 𝑝_model so that it mimics 𝑝_data.
- Therefore, 𝑝_model should have the following properties:
  - Accuracy: samples generated from 𝑝_model should look as if they had been drawn from 𝑝_data.
  - Generation: it should be easy to sample from 𝑝_model to generate new samples.
  - Representation: we should be able to see how features are represented by 𝑝_model.

Generative Modeling
- The true distribution 𝑝_data is a distribution over the land mass of the world, with all points being on land.
- Our model 𝑝_model is a simplification of 𝑝_data.
- Point A could be generated by 𝑝_model, but not by 𝑝_data, because it's in the middle of the sea.
- Point B could never be generated by 𝑝_model, as it lies outside of the orange box.
- Point C could be generated by both 𝑝_data and 𝑝_model.
- It's easier to sample from our model because it is less complex.
- It's a simple representation of the underlying, more complex distribution.
- The actual distribution 𝑝_data is separated into multiple continents.
- The same idea holds for our model 𝑝_model; it just has a single "continent" (the box).
- We're still data-dependent!
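The world-map example can be mimicked with a toy geometry. In this sketch, every coordinate is invented for illustration: "land" is two rectangles standing in for continents, and 𝑝_model is a uniform distribution over one bounding box standing in for the orange box. The points A, B and C are placed so that they behave like the three points described above.

```python
import random

# Hypothetical toy world: "land" is two rectangles (x_min, y_min, x_max, y_max).
LAND = [(0.0, 0.0, 4.0, 3.0), (6.0, 1.0, 9.0, 5.0)]
# p_model: uniform over one box, standing in for the orange box.
BOX = (0.0, 0.0, 9.0, 4.0)

def in_rect(point, rect):
    x, y = point
    x_min, y_min, x_max, y_max = rect
    return x_min <= x <= x_max and y_min <= y <= y_max

def p_data_can_generate(point):
    """True if the point is on land, i.e. has non-zero density under p_data."""
    return any(in_rect(point, rect) for rect in LAND)

def p_model_can_generate(point):
    """True if the point lies inside the box, i.e. could be sampled from p_model."""
    return in_rect(point, BOX)

def sample_model(rng):
    x_min, y_min, x_max, y_max = BOX
    return (rng.uniform(x_min, x_max), rng.uniform(y_min, y_max))

points = {"A": (5.0, 2.0), "B": (7.0, 4.5), "C": (2.0, 1.0)}
for name, point in points.items():
    print(name, "p_model:", p_model_can_generate(point), "p_data:", p_data_can_generate(point))

# Rough accuracy check: which share of the model's samples lands on land?
rng = random.Random(0)
samples = [sample_model(rng) for _ in range(10_000)]
land_share = sum(p_data_can_generate(s) for s in samples) / len(samples)
print("share of model samples on land:", round(land_share, 2))
```

Sampling from the box is trivial, which is the "Generation" property; the share of samples that end up in the sea is the price paid in "Accuracy".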
The presented approach is a simplified view of how to build generative models, but it also holds for more difficult problems.

Generative Adversarial Networks
- The introduction of generative adversarial networks (GANs) is considered a key turning point in the history of generative AI.
- The core idea has spawned some of the most successful and widely used generative models.
- The basic idea is to train two neural networks in an adversarial setting, starting with a random noise input.
GANs are made up of two neural networks:
- Generator (generates samples from a random noise input)
- Discriminator (trained to distinguish whether a sample is real, i.e. from the training data, or fake)
- The generator's goal is to fool the discriminator.
- The discriminator's goal is to keep distinguishing between real and fake data.
- Both are trained simultaneously.
- GANs are a battle between the generator and the discriminator, which effectively play a zero-sum (minimax) game.

GANs - Architecture
- In the following, we take a look at the architecture of GANs.
- Images are taken from GAN Lab, a tool for experimenting with Generative Adversarial Networks (GANs).
- GAN Lab is available at https://poloclub.github.io/ganlab/
- The dynamics of this architecture can be found in various generative AI settings.
- With this approach, you can also train a model to create deepfakes.
- Deepfakes are synthetic media in which a person in an existing image or video is replaced with someone else's likeness.

Deepfakes
Use cases include:
- Entertainment: used in movies, videos, and digital content creation for special effects and visual storytelling
- Social engineering: misuse includes creating forged videos for political, social, or personal deception
- Research and development: advancements in AI and media manipulation technologies
Because of the potential for misuse, deepfakes have sparked a debate about the ethical implications of the technology and what can be done to counteract its negative impact.

Drag Your GAN
- In August 2023, a group of researchers introduced Drag Your GAN.
- It is an interactive, point-based tool for manipulating images, built on generative image models.
- The tool allows users to define vectors on an image; these vectors are then used to manipulate the image.
- The tool uses a GAN model based on StyleGAN3.
- Try it out on Hugging Face.

Transformer Architecture
- Developed in 2017 by Vaswani et al. for natural language processing (NLP) tasks
- Revolutionized NLP, enabling models like BERT, GPT, and T5
- Enables parallel processing, leading to faster training
- Highly scalable: can handle large datasets effectively
Consists of an encoder/decoder architecture:
- Encoder: processes the input and maps it to a representation
- Decoder: uses the representation to generate the output
- Encoder and decoder can also be used separately.
- The key component is the self-attention mechanism, which allows the model to focus on different parts of a sentence simultaneously.
- Multi-head attention combines multiple attention heads to capture diverse relationships in the data.
- Positional encoding captures word order.
- Residual connections and normalization layers stabilize training.
- Widely used in NLP tasks: machine translation, text generation, summarization
- Transformers are also applied in vision (Vision Transformers).
- Basis for groundbreaking models like ChatGPT, BERT, and DALL-E

ChatGPT
- Chatbot developed by OpenAI, first released in November 2022
- It is a GPT model (Generative Pre-trained Transformer), fine-tuned for conversational responses.
- There is no official paper explaining how ChatGPT works, only a blog post by OpenAI.
- In addition to pre-training the GPT model, OpenAI used reinforcement learning from human feedback (RLHF) to fine-tune the GPT-3.5 model to generate more human-like responses.
- It is continuously optimized by including feedback from users.

ChatGPT - Training

Scaling GPT
The advanced capabilities of newer GPT versions result from scaling:
- Parameters
- Training data
- Multimodality (understands more than text, e.g. images)

Model | Year released | Parameters | Training data
GPT-1 | 2018 | 117 million | BooksCorpus (5 GB)
GPT-2 | 2019 | 1.5 billion | WebText (40 GB)
GPT-3 | 2020 | 175 billion | Diverse internet text (570 GB+), Common Crawl
GPT-4 | 2023 | ~1 trillion* | Even more diverse and curated sources (internet, academic, multimodal)
* Estimated model size, not confirmed by OpenAI

Limitations of GPT and other LLMs
- LLMs (Large Language Models) like GPT are very good at generating text, but they have some limitations.
- They often lack common sense and reasoning.
- They are prone to confirmation bias and stereotyping.
- They can be used to generate fake news and misinformation.
- This is often the result of the data used to train the model!

Human after all…
- Screenshot of the UI that OpenAI's labelers used to create training data

Hallucinations in LLMs
- A "hallucination" is when the model generates information that sounds correct but is false or made up.
- Hallucinations happen because LLMs don't understand facts or truth; they only predict likely words based on patterns from their training data, even if those patterns lead to misinformation.
- If an LLM is asked about a specific event it hasn't seen before, it might "fill in" the gaps with plausible details that are actually incorrect.
- LLMs are only as good/bad as the data used ⇒ We're still data-dependent!
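The "predict likely words from patterns" behaviour can be illustrated with a deliberately tiny stand-in for an LLM. The sketch below is not how GPT works internally; it is a bigram counter with greedy decoding, and the four-sentence corpus is invented. It is only meant to show how pattern completion produces a fluent but unfounded statement about something the model has never seen.

```python
from collections import Counter, defaultdict

# Hypothetical miniature training data; "madrid" never appears in it.
corpus = [
    "paris is the capital of france",
    "berlin is the capital of germany",
    "rome is the capital of italy",
    "vienna is the capital of austria",
]

def train_bigrams(sentences):
    """Count which word follows which word in the training sentences."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for current, following in zip(words, words[1:]):
            counts[current][following] += 1
    return counts

def complete(counts, prompt, max_words=10):
    """Greedy decoding: keep appending the most frequent next word."""
    words = prompt.split()
    while len(words) < max_words and counts[words[-1]]:
        next_word = counts[words[-1]].most_common(1)[0][0]
        words.append(next_word)
    return " ".join(words)

bigrams = train_bigrams(corpus)
completion = complete(bigrams, "madrid is")
print(completion)
```

The completion follows the learned pattern "is the capital of" and ends with a country taken from the training sentences, although nothing in the data supports any statement about Madrid. The model has no notion of being wrong; it only continues the most frequent pattern.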

Retrieval Augmented Generation
- Combination of search + generation
- Access to external knowledge sources (e.g., databases, documents)
- Retrieves relevant information before generating responses
- Reduces hallucinations by grounding answers in factual data
- Enhanced accuracy with real-time, up-to-date information
Imagine you were given the task of explaining Generative AI, but you had no idea what it is. How would you do it?

Retrieval
- To get information on a topic, you would look up resources that describe it. Those can essentially be seen as documents containing that knowledge.
- You would start by searching the web and looking up books, research, scientific papers…
- ⇒ You are retrieving documents that contain the relevant knowledge.
During this process, you would intuitively decide what content is relevant to your search by trying to:
- Understand concepts
- Categorize information
- Relate it to similar ideas
This is called semantic indexing.

Augmentation
After finding the relevant information, you must read (process) it. During this process, you would:
- Highlight important parts
- Discard irrelevant information
- Connect different pieces of information to fill gaps in your knowledge
- Merge in new information

Generation
After gathering and augmenting the information, you must answer the question in your own words.
During this, you would articulate the learned ideas by:
- Paraphrasing
- Summarizing
- Adding your own insights
Now let's have a look at how we would implement this process…

Retrieval
During retrieval, you could retrieve data from:
- Databases
- Document storage
- Web APIs
- …
To make this process more effective and retrieve only relevant information (like humans do intuitively), a technique called semantic indexing is applied.

Semantic Indexing
- Semantic indexing organizes documents by their meaning and can capture concepts and deep relationships between words.
- A simple keyword-based search might miss important documents.
- A query for "Generative AI" might miss relevant documents, as a keyword search would not relate "Generative AI" to "neural networks that generate images".
- ⇒ Even though both relate to the same concept

Augmentation
Merging the retrieved information with the internal knowledge of the model:
- Preprocessing of input data
- Ranking and retrieving content from documents
- Linking retrieved information with the internal knowledge of the model

Generation
- The augmented data is passed into an LLM.
- The LLM tries to answer the initial query based on the provided information.
- To steer the results of the generation, a technique called prompt engineering is applied to get a more fitting answer to the question. It is provided as additional context to the model, for example:
  {"role": "system", "content": "You are an AI research assistant. You use a tone that is technical and scientific."},

Naïve RAG
1. Documents get indexed.
2. The user enters a query.
3. Relevant documents get retrieved.
4. Documents get augmented to be prepared for generation.
5. Documents and query are entered into a frozen LLM (the model does not change), with additional prompt engineering.

Advanced RAG
- RAG systems are very modular and can be extended to various degrees.
- Adding more preprocessing steps (e.g., summarizing or reranking retrieved documents) can enhance the output.

RAG
Step | Human | RAG
Retrieval | Looking up information from articles, papers, or websites | Querying a database or document store for relevant text
Augmentation | Understanding, synthesizing, and filtering relevant info | Preprocessing and ranking retrieved data; integrating with the model's knowledge
Generation | Writing a cohesive answer using learned knowledge | Using a language model to generate a coherent response based on the augmented info
⇒ The primary difference between humans and RAG systems is speed and scale.
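The five naïve RAG steps can be sketched end to end in plain Python. Several parts are stand-ins and should be read as assumptions: the three documents are invented, `embed` is a bag-of-words count rather than a learned semantic embedding (so it would still miss the "Generative AI" versus "neural networks that generate images" case), and `generate` does not call any model. Only the system message is taken from the prompt engineering example above.

```python
import math
from collections import Counter

# Hypothetical document store.
DOCUMENTS = [
    "Generative adversarial networks train a generator and a discriminator against each other.",
    "Retrieval augmented generation retrieves documents before generating an answer.",
    "The transformer architecture relies on self-attention.",
]

def embed(text):
    """Toy embedding: word counts. A real system would use a learned embedding model."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def build_index(documents):
    """Step 1: index every document once, ahead of any query."""
    return [(doc, embed(doc)) for doc in documents]

def retrieve(index, query, k=1):
    """Step 3: return the k documents most similar to the query."""
    query_vector = embed(query)
    ranked = sorted(index, key=lambda item: cosine(query_vector, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

def augment(query, retrieved):
    """Step 4: combine the retrieved text and the query into chat messages."""
    context = "\n".join(f"- {doc}" for doc in retrieved)
    return [
        {"role": "system", "content": "You are an AI research assistant. You use a tone that is technical and scientific."},
        {"role": "user", "content": f"Answer using only this context:\n{context}\n\nQuestion: {query}"},
    ]

def generate(messages):
    """Step 5 stand-in for the frozen LLM: no model is called here."""
    prompt = messages[-1]["content"]
    return f"[an LLM would answer here, given a prompt of {len(prompt)} characters]"

index = build_index(DOCUMENTS)
query = "How does retrieval augmented generation reduce hallucinations?"  # step 2
retrieved = retrieve(index, query, k=1)
messages = augment(query, retrieved)
print(generate(messages))
```

Because each step is its own function, the "Advanced RAG" idea of adding a summarization or reranking stage amounts to inserting one more function between `retrieve` and `augment`.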

Summary
- Generative AI produces new data that is similar to a given dataset.
- The goal of a generative model is to mimic a probability distribution 𝑝_data by producing a probability distribution 𝑝_model (𝑝_model is a simplification of 𝑝_data).
- GANs combine generative learning with discriminative learning (and can be used to make deepfakes).
- The Transformer architecture enabled LLMs through the self-attention mechanism.
- LLMs such as GPT scale with more parameters and more data.
- Training of ChatGPT combines unsupervised, supervised, and reinforcement learning.
- Hallucinations in LLMs: the model predicts the next likely word from patterns in the training data to fill gaps in its knowledge.
- LLMs are only as good/bad as their training data.
- RAG systems reduce hallucinations by grounding answers in factual data.

References
- L. Ouyang et al., "Training language models to follow instructions with human feedback," Mar. 04, 2022, arXiv:2203.02155. Accessed: Nov. 07, 2024. [Online]. Available: http://arxiv.org/abs/2203.02155
- OpenAI, "Introducing ChatGPT." Accessed: Nov. 07, 2024. [Online]. Available: https://openai.com/index/chatgpt/
- A. Radford, J. Wu, R. Child, D. Luan, D. Amodei, and I. Sutskever, "Language Models are Unsupervised Multitask Learners," 2019. Accessed: Nov. 06, 2024. [Online]. Available: https://cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf
- A. Radford, K. Narasimhan, T. Salimans, and I. Sutskever, "Improving Language Understanding by Generative Pre-Training," 2018. Accessed: Nov. 06, 2024. [Online]. Available: https://cdn.openai.com/research-covers/language-unsupervised/language_understanding_paper.pdf
- T. B. Brown et al., "Language Models are Few-Shot Learners," Jul. 22, 2020, arXiv:2005.14165. Accessed: Nov. 06, 2024. [Online]. Available: http://arxiv.org/abs/2005.14165
- OpenAI et al., "GPT-4 Technical Report," Mar. 04, 2024, arXiv:2303.08774. Accessed: Nov. 06, 2024. [Online]. Available: http://arxiv.org/abs/2303.08774
- X. Pan, A. Tewari, T. Leimkühler, L. Liu, A. Meka, and C. Theobalt, "Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold," in Special Interest Group on Computer Graphics and Interactive Techniques Conference Conference Proceedings, Jul. 2023, pp. 1–11. doi: 10.1145/3588432.3591500.
- D. Foster and K. J. Friston, Generative deep learning: teaching machines to paint, write, compose, and play, Second edition. Beijing; Boston: O'Reilly, 2023.
- Y. Gao et al., "Retrieval-Augmented Generation for Large Language Models: A Survey," Mar. 27, 2024, arXiv:2312.10997. Accessed: Oct. 25, 2024. [Online]. Available: http://arxiv.org/abs/2312.10997
- A. Vaswani et al., "Attention Is All You Need," Aug. 01, 2023, arXiv:1706.03762. doi: 10.48550/arXiv.1706.03762.