EU AI Act & AI Regulation - Augmented LLM

Summary

This document provides an overview of AI regulations, including the EU AI Act, the UN's global AI resolution, and Canada's proposed AIDA. It then introduces LLM APIs, running LLMs locally, and augmenting LLMs with retrieval, covering traditional information retrieval, embeddings, and nearest-neighbor search.

Full Transcript

EU AI Act

After three years (the proposal dates to 2021), the EU approved the AI Act in March 2024. The law entered into force in May 2024 within the EU.

- Some AI uses will get banned later: AI use cases that pose a high risk to people's fundamental rights, such as in healthcare, education, and policing.
- It will be more obvious when you're interacting with an AI system: tech companies will be required to label deepfakes and AI-generated content, and to notify people when they are interacting with a chatbot or other AI system.
- Citizens can complain if they have been harmed by an AI: a new European AI Office will coordinate compliance, implementation, and enforcement.
- AI companies will need to be more transparent: this applies to AI companies developing technologies in "high-risk" sectors, such as critical infrastructure or healthcare. AI companies developing large language models need to create and keep technical documentation showing how they built the model and how they respect copyright law, and must publish a publicly available summary of what training data went into training the model. Free, open-source AI models that share every detail of how the model was built, including the model's architecture, parameters, and weights, are exempt from many of the obligations of the AI Act.

UN AI Resolution

On March 21, 2024, the UN adopted its first global AI resolution, proposed by the US and co-sponsored by China and over 120 other nations. It encourages countries to safeguard human rights, protect personal data, and monitor AI for risks. It is nonbinding but still important: "The improper or malicious design, development, deployment and use of artificial intelligence systems... pose risks that could... undercut the protection, promotion and enjoyment of human rights and fundamental freedoms."
https://documents.un.org/doc/undoc/ltd/n24/065/92/pdf/n2406592.pdf?token=ZU7FDFIX3Vv8mo1Ggx&fe=true

AIDA: Canada's AI Act

The Artificial Intelligence and Data Act (AIDA) was proposed in June 2022 by the Government of Canada to "ensure the development of responsible AI in Canada, and to prominently position Canadian firms and values in global AI development." AIDA's proposed regulations will largely apply to what it calls "high-impact AI systems", essentially the same concept as the EU AI Act's "high-risk" category. It has not yet been passed, but is coming soon.

Reflection and Discussion

What do you think about such AI laws and regulations? Pros/cons? For/against?

Lack of Factuality, Reliability of Results

Model outputs are fluent, but sometimes models hallucinate: wrong, toxic, or otherwise undesirable output. Solutions?
- "Cite your sources" through retrieval-based models (e.g., GPT-o1, Bing search, perplexity.ai)
- Getting models to "know what they know" through calibration
- Providing better context

Lack of Robustness

Models are less effective when tried on new applications, domains, and languages. Solutions?
- Methods for engineering/prompting on individual tasks
- Training special-purpose models for domains (e.g., science: Meta Galactica; medicine: Google MedPaLM; finance: BloombergGPT)
- Fine-tuning on domain-specific datasets

Introduction to LLM APIs

A few LLM APIs:
- OpenAI: models GPT-4o, GPT-4o-mini; tasks: completion, fine-tuning, function calling
- Anthropic: models Claude 3; tasks: completion
- AWS Bedrock: models Titan, Llama, and many more

OpenAI ChatGPT API Calls

API calls are as easy as invoking a REST API: install the OpenAI library and get an OpenAI API key.

    import openai

    # Set up your OpenAI API credentials
    secret_key = "copy and paste your key here"
    client = openai.OpenAI(api_key=secret_key)

    # Send a prompt to a chat model and print the reply
    output = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": "Tell me a joke"}],
    )
    print(output.choices[0].message.content)

Introduction to Running LLMs Locally

There are open-source ecosystems offering collections of chatbot LLMs similar to ChatGPT, which can be run locally on your computer. Why?
- Offline mode
- Privacy and security
- Cost

GPT4All is an LLM ecosystem that you can install on your computer. Try different models locally with the UI; start with GPT-J.

Libraries to Develop LLM Applications

Popular libraries in the LLM ecosystem:
- LangChain: popular Python library for chaining multiple generative models together
- Streamlit: for building a ChatGPT-like web interface
- Hugging Face: Python tools for pre-processing, training, fine-tuning, and deployment
- Pinecone, Chroma DB, Milvus: vector databases

AI Feature (A6)

You will need to integrate an AI-based feature into your web application. Questions?

Augmenting Large Language Models

There's a lot language models don't know. What (base) LLMs are good at versus what they need help with:
- Language understanding vs. up-to-date knowledge
- Instruction following vs. knowledge of your data
- Basic reasoning vs. more challenging reasoning
- Code understanding vs. interacting with the world

LLMs are for general reasoning, not specific knowledge.

A baseline: using the context window. Context is the way to give an LLM unique, up-to-date information, but it only fits a limited amount of information (e.g., GPT-4 Turbo, Feb 2024: 128k tokens). How much information can you fit in the context window?
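A quick way to reason about context limits is a back-of-the-envelope token estimate. The helper below is hypothetical and not from the lecture; it uses the common rough heuristic of ~4 characters per English token, whereas real APIs count tokens with a proper tokenizer:

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate using the common ~4 characters/token heuristic.

    Illustrative only: real token counts come from the model's tokenizer.
    """
    return max(1, len(text) // 4)


def fits_context(text: str, context_tokens: int = 128_000) -> bool:
    """Check whether text roughly fits a context window of the given size."""
    return estimate_tokens(text) <= context_tokens


# A ~500-character paragraph is on the order of 125 tokens.
print(estimate_tokens("x" * 500))  # 125
```

This kind of estimate is only useful for sizing decisions (will this document fit, or do we need retrieval?), which is exactly the question the rest of the lecture addresses.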
How much information can you fit in the context window?

    Tokens       Example model       How much is it?
    50           -                   A sentence
    500          GPT-1               ~4 paragraphs
    4,000        GPT-3.5             A New Yorker article
    32,000       GPT-4-32K           A college thesis
    256,000      GPT-4-128K-Turbo    A novel
    2,048,000    -                   ~7,500 emails (a year's worth of writing for a productive office worker)
    8,192,000    -                   ~30 seconds' worth of tweets (at 40 tokens per tweet, ~400,000 tweets/minute)
    65,536,000   -                   ~500 MB of unicode text (a single ElasticSearch node can store 50 GB)

Context windows are growing fast, but won't fit everything for a while (plus, more context = more $$$). This lecture: how to make the most of a limited context by augmenting the language model.

Augmented Language Models
- Retrieval: augment with a bigger corpus
- Chains: augment with more LLM calls
- Tools: augment with outside sources

RAG: Retrieval-Augmented Generation

Retrieval augmentation outline:
A. Why retrieval augmentation?
B. Traditional information retrieval
C. Information retrieval via embeddings
D. Patterns and case studies

Read this article: https://www.marktechpost.com/2024/04/01/evolution-of-rags-naive-rag-advanced-rag-and-modular-rag-architectures/

A. Why retrieval augmentation?

Say we want our model to have access to user data. Approach 1: put it in the context. But what if we have thousands of users? We could use rules to figure out which users should go in the context: the most recent users? Users mentioned in the query? The most viewed users? What happens if the relationship is hard to write rules for? Context-building is information retrieval.

B. Traditional information retrieval

Information retrieval basics:
- Query. Formal statement of your information need, e.g., a search string.
- Object. Entity inside your content collection, e.g., a document.
- Relevance.
  Measure of how well an object satisfies the information need.
- Ranking. Ordering of relevant results based on desirability.

Traditional information retrieval: search via inverted indexes.
https://www.elastic.co/blog/found-elasticsearch-from-the-bottom-up

Ranking and relevance in traditional search:
- Relevance via boolean search, e.g., only return the docs that contain: simple AND rest AND apis AND distributed AND nature.
- Ranking via BM25, affected by three factors:
  - Term frequency (TF): more appearances of the search term = more relevant object.
  - Inverse document frequency (IDF): more objects containing the search term = less important search term.
  - Field length: if a document contains a search term in a field that is very short (i.e., has few words), it is more likely relevant than a document that contains the search term in a field that is very long (i.e., has many words).

Search engines are more than inverted indices: document ingestion, document processing (e.g., removing stop words, lower-casing), transaction handling (adding/deleting documents, merging index files), scaling via shards, ranking and relevance, etc.

Limitations of "sparse" traditional search: it only models simple word frequencies; it doesn't capture semantic information, correlation information, etc. E.g., searching for "what is the top hand in bridge" might return documents about...
https://www.microsoft.com/en-us/research/video/research-talk-system-frontiers-for-dense-retrieval/
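To make the TF/IDF/field-length factors concrete, here is a minimal, self-contained sketch of BM25-style scoring. It is a simplified illustration, not Elasticsearch's production implementation; the constants k1=1.5 and b=0.75 are conventional defaults, and the corpus is made up:

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with simplified BM25.

    TF: more occurrences of a term in a doc -> higher score.
    IDF: terms appearing in many docs contribute less.
    Length: matches in short docs count more than in long docs.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n = len(tokenized)
    avg_len = sum(len(d) for d in tokenized) / n
    scores = []
    for doc in tokenized:
        tf = Counter(doc)
        score = 0.0
        for term in query.lower().split():
            df = sum(1 for d in tokenized if term in d)
            if df == 0:
                continue  # term appears nowhere: contributes nothing
            idf = math.log(1 + (n - df + 0.5) / (df + 0.5))
            norm_tf = (tf[term] * (k1 + 1)) / (
                tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            )
            score += idf * norm_tf
        scores.append(score)
    return scores


docs = [
    "simple distributed rest apis",
    "cooking simple pasta at home",
    "distributed systems by nature use rest apis",
]
scores = bm25_scores("simple rest apis", docs)
print(scores.index(max(scores)))  # 0: the short doc matching all terms wins
```

Note how the short first document, which matches every query term, outranks the longer third document that matches only two terms, exactly the TF and field-length behavior described above.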
C. AI-powered information retrieval via embeddings

Search and AI make each other better: search puts better information in the context; AI provides better representations of data (embeddings).

All about embeddings:
- AI-powered retrieval via embeddings
- Embedding relevance and indexes
- Embedding databases
- Beyond naive nearest neighbor

Embeddings are an abstract, dense, compact, fixed-size, (usually) learned representation of data.

    "Sparse" representation    "Dense" representation
    Contains "coffee"          0.1231
    Contains "tea"             0.7412
    ...                        ...
    Contains "laptop"          0.6221

What are embeddings? "Embeddings are (learned) transformations to make data more useful." (The shortest definition of embeddings? https://roycoding.com/blog/2022/embeddings.html)

What are embeddings not? Embeddings are not restricted to one modality of data: a single embedding doesn't have to refer to one single type of input ("deep multimodal embeddings"). Embeddings also don't have to come from a neural network, though nowadays most do. (What is an embedding, anyways? https://simplicityissota.substack.com/p/what-is-an-embedding-anyways)

Why embeddings? Vectors are a compact, universal representation of data: inputs of very different sizes (443 B, 3.4 MB, 4.1 MB, 1.1 GB) all map to fixed-size vectors.

What makes a good embedding?
- Utility for the downstream task (https://huggingface.co/spaces/mteb/leaderboard)
- Similar things should be close together, and different things far apart. How can we tell if two things are close to each other ("coffee" and "tea" vs. "ball" and "crocodile")? Cosine similarity.

What is an embedding? What is it good for?
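The "similar things close together" criterion is usually measured with cosine similarity: the cosine of the angle between two vectors, near 1 for similar directions and near 0 for unrelated ones. A minimal sketch; the 3-d "embeddings" for coffee, tea, and crocodile are toy values made up for illustration:

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


# Hypothetical 3-d "embeddings": related drinks point in similar directions.
coffee = [0.9, 0.8, 0.1]
tea = [0.8, 0.9, 0.2]
crocodile = [0.1, 0.0, 0.95]

print(cosine_similarity(coffee, tea) > cosine_similarity(coffee, crocodile))  # True
```

Because cosine similarity ignores vector length and only compares direction, it works well with embeddings of varying norms, which is why it is the default relevance measure in most vector databases.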
Embeddings to know:
- The very first: Word2Vec
- The baseline: sentence transformers
- A multimodal option (text, image): CLIP
- The one to use: OpenAI
- Where things are going: Instructor

Good, fast, and cheap: OpenAI embeddings. Use text-embedding-3-small for efficiency and performance, or text-embedding-3-large, which generates embeddings with up to 3,072 dimensions, providing detailed and nuanced text representations. Easy to use, with good results in practice.

Exercise: train an embedding for web pages. Design an approach to train an embedding for web pages. Goal: learn what features a web page provides, e.g., login, user registration, purchasing an item, search. Write the steps down and hand them in! You can work in groups of two; write both names/student numbers on the paper.

A minimal recipe for nearest-neighbor similarity:
1. Calculate embeddings for your corpus.
2. Store the embeddings as an array of vectors.
3. Calculate the embedding of any query and save it as a vector q.
4. Compute the cosine similarity between q and the vectors in the array.
https://www.ethanrosenthal.com/2023/04/10/nn-vs-ann/

When do you need more than that? If you have
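The minimal nearest-neighbor recipe above can be sketched end to end. The `embed` function here is a stand-in (a toy hashed bag-of-words vector, not a real model such as text-embedding-3-small), but the store/query/cosine-rank flow is the same:

```python
import math
import zlib
from collections import Counter

DIM = 64  # toy embedding dimensionality


def embed(text):
    """Toy stand-in for a real embedding model: hashed bag-of-words vector."""
    vec = [0.0] * DIM
    for word, count in Counter(text.lower().split()).items():
        vec[zlib.crc32(word.encode()) % DIM] += count
    return vec


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norms if norms else 0.0


# Steps 1-2: calculate embeddings for the corpus, store them as an array.
corpus = [
    "how to reset your password",
    "pricing plans and billing",
    "reset a forgotten password by email",
]
index = [embed(doc) for doc in corpus]

# Step 3: embed the query as a vector q.
q = embed("password reset")

# Step 4: cosine similarity between q and every stored vector; take the best.
best = max(range(len(corpus)), key=lambda i: cosine(q, index[i]))
print(corpus[best])
```

A real system swaps `embed` for an embedding API and the plain array for a vector database (Pinecone, Chroma DB, Milvus), but the ranking logic is exactly this.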
