Questions and Answers
What is a primary benefit of Retrieval Augmented Generation (RAG) in contrast to traditional learning methods?
Which of the following describes a limitation of large language models (LLMs)?
What method involves training a pre-trained model on a specific, smaller dataset?
Which technique is used in RAG to prepare data for model input?
Which of the following is NOT a challenge faced by large language models?
What is a primary benefit of using the Self Attention mechanism in Transformers?
Which of the following reflects a limitation of generative AI?
What role do prompts play in the functioning of language models?
In the context of large language models, what does autoregressive modeling primarily depend on?
What characteristic of training data can significantly affect the outputs of generative models?
Study Notes
Large Language Models
- LLMs generate data sequentially based on conditional probabilities.
- Transformers, a type of neural network, are used as the core of many LLMs.
- Transformers can capture long-range context through self-attention mechanisms.
- LLMs are able to generate text that is contextually relevant, tailored to specific tasks or conditions, thanks to Transformers.
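The self-attention idea above can be sketched in plain Python. The two-dimensional token vectors and the use of raw embeddings as queries, keys, and values are illustrative simplifications; a real Transformer applies learned linear projections and many attention heads:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(queries, keys, values):
    """Scaled dot-product attention: every position attends to every
    other position, so long-range context is only one step away."""
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)  # attention weights sum to 1
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out

# Three toy token vectors; each output row is a context-aware mix of all rows.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
ctx = self_attention(x, x, x)
```

Because every pair of positions interacts directly, distant tokens influence each other without the signal decaying through many recurrent steps.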
Evolution of LLMs
- LLMs have progressed in terms of size and capability.
Generative AI Applications
- LLMs can be used for a wide range of applications.
- For instance, they can assist in customer service, research, legal discovery, and financial analysis.
Generative AI Limitations
- LLMs can lack control over specific attributes or features of the generated output.
- If training data is biased, limited in scope, or contains errors, these shortcomings may be reflected in the generated output.
- The use of LLMs raises ethical concerns as generated content could be misused.
Prompt Engineering
- Prompts are instructions and context passed to a language model to achieve a desired task.
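A prompt usually bundles an instruction, supporting context, and the user's question. A minimal sketch, where the section labels and template are illustrative rather than any standard format:

```python
def build_prompt(instruction, context, question):
    """Assemble a prompt from an instruction, supporting context, and
    the user's question; the labels below are illustrative."""
    return (
        f"{instruction}\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

prompt = build_prompt(
    "Answer using only the context below.",
    "Transformers capture long-range context via self-attention.",
    "How do Transformers capture long-range context?",
)
```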
Retrieval Augmented Generation (RAG)
- RAG combines the capabilities of LLMs with external information.
- It's similar to an open-book exam, allowing the LLM to access and leverage external knowledge.
- RAG involves a six-step process:
- Data Collection and Extraction
- Data Chunking
- Document Embeddings
- Storing Embeddings in a Vectorstore
- Retrieval
- Generation
Data Collection and Extraction
- RAG collects data from various sources like PDFs, websites, databases, and manuals.
Data Chunking
- Data is split into smaller, more manageable chunks to focus on specific topics.
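One simple chunking strategy is fixed-size chunks with overlap, so text cut at a boundary still appears whole in at least one chunk. The character-based sizes here are arbitrary; production systems often chunk by tokens, sentences, or document structure instead:

```python
def chunk_text(text, chunk_size=100, overlap=20):
    """Split text into fixed-size character chunks; consecutive chunks
    share `overlap` characters so boundary content is not lost."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

chunks = chunk_text("word " * 100, chunk_size=100, overlap=20)
```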
Embeddings
- Words and sentences are represented numerically as vectors to capture semantic meaning.
Document Embeddings
- The split document chunks are transformed into embeddings, which help the system understand user queries and match them with relevant chunks.
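To make the matching step concrete, here is a toy bag-of-words embedding compared with cosine similarity. Real RAG systems use a learned embedding model, not word counts; this only illustrates "similar meaning maps to nearby vectors":

```python
from collections import Counter
import math

def embed(text, vocab):
    """Toy bag-of-words embedding over a fixed vocabulary; a stand-in
    for a learned embedding model."""
    counts = Counter(text.lower().split())
    return [float(counts[w]) for w in vocab]

def cosine(a, b):
    """Cosine similarity between two vectors (0.0 for zero vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

vocab = ["transformers", "attention", "retrieval", "embeddings"]
doc_vec = embed("Transformers use attention", vocab)
query_vec = embed("attention in transformers", vocab)
```

The query and the chunk share the same salient words, so their vectors point in the same direction even though the surface strings differ.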
Store Embeddings in Vectorstore
- Embeddings are stored in a vectorstore, a specialized database that allows for quick search and retrieval of similar vectors.
Retrieval
- User queries are transformed into embeddings and compared against embeddings in the vectorstore to identify the most relevant chunks.
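The storage and retrieval steps together can be sketched as an in-memory vectorstore with brute-force nearest-neighbor search. Real vectorstores use approximate-nearest-neighbor indexes to stay fast at scale; the class name here is invented for illustration:

```python
import math

class InMemoryVectorstore:
    """Minimal vectorstore sketch: stores (embedding, chunk) pairs and
    returns the chunks whose embeddings are closest to the query."""

    def __init__(self):
        self.items = []

    def add(self, embedding, chunk):
        self.items.append((embedding, chunk))

    def search(self, query_embedding, k=2):
        def cos(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            na = math.sqrt(sum(x * x for x in a))
            nb = math.sqrt(sum(y * y for y in b))
            return dot / (na * nb) if na and nb else 0.0
        ranked = sorted(self.items,
                        key=lambda item: cos(query_embedding, item[0]),
                        reverse=True)
        return [chunk for _, chunk in ranked[:k]]

store = InMemoryVectorstore()
store.add([1.0, 0.0], "chunk about transformers")
store.add([0.0, 1.0], "chunk about databases")
hits = store.search([0.9, 0.1], k=1)
```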
Generation
- The LLM generates the answer based on the user's query, the retrieved context, and system instructions.
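The generation step stitches those three ingredients into the final prompt handed to the model. The template below is one plausible layout, not a fixed standard, and the actual model call is omitted:

```python
def build_rag_prompt(system_instructions, retrieved_chunks, user_query):
    """Combine system instructions, retrieved context, and the user's
    query into the final prompt; the LLM call itself is out of scope."""
    context = "\n".join(f"- {chunk}" for chunk in retrieved_chunks)
    return (
        f"{system_instructions}\n\n"
        f"Retrieved context:\n{context}\n\n"
        f"User question: {user_query}"
    )

prompt = build_rag_prompt(
    "Answer only from the retrieved context.",
    ["RAG combines LLMs with external information."],
    "What does RAG combine?",
)
```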
Applications of RAG
- Customer Services: RAG-powered chatbots can leverage knowledge bases and customer history to provide personalized support.
- Research and Market Intelligence: RAG helps to quickly synthesize insights from large volumes of data.
- Legal Discovery: RAG helps legal professionals quickly find relevant precedents and arguments across large collections of case law.
- Financial Analysis: RAG can ingest earnings statements, press releases, and regulatory filings to generate investment insights and trading signals.
Advanced RAG Techniques
- Parent Document Retriever: Matches queries against small chunks but returns their larger parent documents, giving the LLM fuller context.
- Hybrid Fusion Search: Combines keyword search and semantic search for more comprehensive results.
- Contextual Compressor: Compresses retrieved chunks so that only the content relevant to the query is passed to the LLM, reducing noise and token usage.
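The fusion step of hybrid search is often done with reciprocal rank fusion (RRF), which merges ranked lists without comparing their incompatible scores directly. A minimal sketch, with made-up document IDs and the commonly used constant k = 60:

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked result lists (e.g. keyword search and
    semantic search): each document scores 1 / (k + rank) in every
    list that contains it, and totals decide the merged order."""
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

keyword_hits = ["doc_a", "doc_b", "doc_c"]
semantic_hits = ["doc_a", "doc_d", "doc_b"]
fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
```

Documents that rank well in both lists (here `doc_a` and `doc_b`) rise to the top, which is the point of combining keyword and semantic retrieval.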
Customized AI Author Agents
- Personalized AI authors can assist with tasks like co-authoring emails and preparing for meetings.
- AI authors can utilize previous work to generate new content tailored to specific audiences and needs.
Description
This quiz explores the fundamentals of Large Language Models (LLMs), including their underlying technology, transformer architecture, and applications in various fields. It also addresses the evolution of LLMs, their limitations, and the ethical concerns surrounding their use. Test your knowledge on how LLMs generate text and their impact on society.