AI & LLM Glossary
Plain-English definitions of the AI terms you'll see thrown around. From attention to zero-shot, with links to the models and tools where each concept matters.
Core AI terms
- LLM (Large Language Model): An AI model trained on huge amounts of text that can read and generate natural language. The models behind ChatGPT, Claude and Gemini are all LLMs.
- Foundation model: A large general-purpose AI model (like GPT-4 or Claude Sonnet) that has been trained on broad data and can be adapted for many tasks.
- Transformer: The neural network architecture introduced in 2017 that underpins virtually every modern LLM, including the models behind ChatGPT, Claude and Gemini.
- Token: The unit AI models read and write in. For English text, one token is roughly 4 characters or 0.75 words. Pricing and context windows are measured in tokens.
- Context window: How much text an AI can process at once, measured in tokens. A bigger context window lets the model handle longer documents, longer chats, and more complex tasks.
- Multimodal: A model that can handle multiple input types (text, images, audio, video), not just text.
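The token rule of thumb above can be turned into a quick back-of-the-envelope check. This is only a sketch: the function names `estimate_tokens` and `fits_in_context` are made up for illustration, and the 4-characters-per-token figure is an approximation for English text, not what any real tokenizer computes.

```python
import math

# Rule of thumb from the glossary: roughly 4 characters per token (English text).
# Real tokenizers differ by model, so treat the result as an estimate only.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Return a rough token count based on character length."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def fits_in_context(text: str, context_window: int, reserved_for_reply: int = 0) -> bool:
    """Check whether the text, plus room reserved for the reply, fits the window."""
    return estimate_tokens(text) + reserved_for_reply <= context_window


document = "word " * 1000  # 5,000 characters
print(estimate_tokens(document))                        # 1250
print(fits_in_context(document, context_window=1000))   # False
```

For real billing or limit checks, use the tokenizer that belongs to the model in question; the estimate is only useful for sizing things up in advance.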
Prompting terms
- Prompt: What you type to an AI. The craft of writing effective prompts is called prompt engineering.
- Prompt engineering: The discipline of writing AI prompts that consistently produce good answers. Includes techniques like few-shot prompting, chain of thought (CoT), and role assignment.
- System prompt: An instruction sent before your message that shapes the AI's persona, tone and constraints for the rest of the conversation.
- Chain of thought: A prompting technique where the AI explains its reasoning step by step before giving a final answer. On multi-step problems this is usually more accurate than answering directly.
- Zero-shot: Asking the AI to do something without giving it any examples. The opposite of few-shot prompting, where the prompt includes a few worked examples.
- Temperature: A setting that controls how random the AI's word choices are. 0 = near-deterministic and predictable. 1 = more varied, creative, sometimes unhinged.
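To make the temperature entry concrete, here is a minimal sketch of how temperature is commonly applied when picking the next token: the model's raw scores (logits) are divided by the temperature before being turned into probabilities. The function names and the example logits are invented for illustration; real providers may add further steps such as top-p filtering.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; a lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, temperature, rng=random):
    """Pick a token index. Temperature 0 is treated as 'always take the top score'."""
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


logits = [1.0, 3.0, 2.0]  # made-up scores for three candidate tokens
print(sample_token(logits, temperature=0))           # 1 (the highest-scoring token)
print(softmax_with_temperature(logits, 0.5))         # sharply favours index 1
print(softmax_with_temperature(logits, 1.0))         # flatter, so more variety when sampling
```

Treating temperature 0 as a plain argmax is a design choice in this sketch; it avoids dividing by zero and matches the "deterministic" behaviour described above.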
Models and providers
- GPT: Generative Pre-trained Transformer. The name OpenAI uses for its family of transformer-based models, which power ChatGPT.
- Claude: Anthropic's family of AI models, known for thoughtful, careful, long-form writing. Available via claude.ai or AskAI.free.
- Gemini: Google's flagship AI model family. Gemini 2.5 Pro leads on long context (1M tokens) and multimodal tasks; Gemini 2.0 Flash is the fast, free-tier option.
- OpenAI: The AI lab behind ChatGPT and the GPT family of models. Founded 2015, now valued in the hundreds of billions.
- Anthropic: The AI lab behind Claude. Founded by ex-OpenAI researchers focused on AI safety. Major investors include Amazon and Google.
- Sonnet: The mid-tier model in Anthropic's Claude lineup, and the one most people use day-to-day.
Accuracy and safety
- Hallucination: When an AI confidently states something false. One of the biggest reliability issues with LLMs; understanding hallucinations helps you use AI more safely.
- RAG (Retrieval-Augmented Generation): Grounding AI answers in your own documents: retrieve relevant context first, then generate the answer. A common way to work around knowledge cutoffs.
- Knowledge cutoff: The date after which the AI doesn't know about world events. ChatGPT, Claude and Gemini all have one. For current events, use a tool with live web search, such as Perplexity.
- Jailbreak: A prompt that tricks an AI into ignoring its safety training and doing something it normally refuses.
- Constitutional AI: Anthropic's training method where Claude is trained against a written 'constitution' of values, rather than relying on ad-hoc human feedback for every example.
- Reasoning model: An AI model that explicitly 'thinks' before answering by generating a long chain of thought. Better at math, code and logic, but slower than standard chat models.
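The "retrieve first, then generate" idea behind RAG can be sketched in a few lines. This toy version assumes a simple word-overlap score instead of the embedding search that production systems typically use, and it stops at building the prompt rather than calling any model. All names and the sample documents are hypothetical.

```python
import re


def words(text):
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, passages, k=1):
    """Return the k passages sharing the most words with the query (toy scoring)."""
    query_words = words(query)
    ranked = sorted(passages, key=lambda p: len(query_words & words(p)), reverse=True)
    return ranked[:k]


def build_prompt(query, passages, k=1):
    """Put the retrieved passages in front of the question, ready to send to a model."""
    context = "\n".join(f"- {p}" for p in retrieve(query, passages, k))
    return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}"


docs = [
    "Refunds are accepted within 30 days of purchase.",
    "Our office is closed on public holidays.",
    "Shipping takes 3 to 5 business days.",
]
print(build_prompt("When are refunds accepted?", docs))
```

The model then answers from the supplied context rather than from memory alone, which is why RAG helps with both hallucinations and knowledge cutoffs.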
A
- AI agent: An AI system that can take actions on your behalf (calling tools, browsing the web, writing files), not just answer with text.
- Alignment: The research problem of making AI systems do what humans actually want, not just what we literally ask for.
- Attention: The mathematical mechanism that lets transformers focus on different parts of the input when generating each output token.
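For the curious, the attention mechanism can be sketched as scaled dot-product attention: compare a query against each key, turn the scores into weights that sum to 1, and blend the values by those weights. This is a single-query, single-head toy with made-up vectors and no learned projections, so it shows the idea rather than a full transformer layer.

```python
import math


def softmax(scores):
    """Convert scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(query, keys, values):
    """Scaled dot-product attention for one query vector."""
    d = len(query)
    # How well does the query match each key? Scale by sqrt(d) to keep scores moderate.
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    # The output is a weighted average of the value vectors.
    output = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
    return weights, output


query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0]]      # the first key matches the query
values = [[10.0, 0.0], [0.0, 10.0]]
weights, output = attention(query, keys, values)
print(weights)  # the first weight is larger: the model "focuses" on the first input
print(output)
```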
C
- ChatGPT: OpenAI's flagship AI chat product. Powered by OpenAI's models (GPT-4o, GPT-4.1, o3, etc.).
P
- Parameter: A single learned number (a weight) inside a neural network. LLMs have billions of parameters; a higher count usually, though not always, means a more capable model.
R
- Reinforcement learning (RL): A training technique where the AI improves by trial and error, getting rewards for good outputs. The 'RL' in RLHF (reinforcement learning from human feedback).
T
- Tokenizer: The component that converts text into tokens (and back). Different models use different tokenizers, which is why the same sentence can have a different token count in GPT vs Claude.
- Training data: The text an LLM is taught on. Typically trillions of tokens scraped from the web, books, code repos and more.
- Tool use (function calling): When an AI model can call external functions (search, calculator, database) instead of just generating text.
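Tool use usually works as a loop: the model replies with a structured request naming a function and its arguments, and the surrounding program runs that function and hands the result back. The sketch below assumes a made-up JSON shape (`tool` and `arguments` keys) and two toy tools; real providers each define their own request format.

```python
import json

# Hypothetical tool registry: names the model is allowed to call, mapped to plain functions.
TOOLS = {
    "add": lambda a, b: a + b,
    "word_count": lambda text: len(text.split()),
}


def run_tool_call(model_output: str):
    """Parse a tool request emitted by the model and run the matching function."""
    call = json.loads(model_output)
    name = call["tool"]
    if name not in TOOLS:
        # Never run anything the program did not explicitly register.
        raise ValueError(f"unknown tool: {name}")
    return TOOLS[name](**call["arguments"])


# Pretend the model answered with this JSON instead of plain text.
model_output = '{"tool": "add", "arguments": {"a": 2, "b": 3}}'
print(run_tool_call(model_output))  # 5
```

In a real system the result would be sent back to the model as another message so it can finish its answer.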