Foundation model
In one line: A large general-purpose AI model (like GPT-4 or Claude Sonnet) that's been trained on broad data and can be adapted for many tasks.
What is a foundation model?
A foundation model is a large AI model trained on broad, general-purpose data and designed to serve as a 'foundation' that can be adapted to many downstream tasks through prompting, fine-tuning, or RAG.
The term was coined in 2021 by researchers at Stanford's Center for Research on Foundation Models (CRFM) to describe a fundamental shift in how AI development works. Instead of training one narrow model per task (a spam classifier, a translator, a summariser), you train one large model on vast general data, then adapt it cheaply to specific use cases. This became possible thanks to the transformer architecture and to scaling laws, which showed that larger models trained on more data become more capable in a fairly predictable way.
What makes something a foundation model
- Scale - Billions of parameters, trained on trillions of tokens of text, code, or multimodal data.
- Generality - Capable of many tasks without task-specific training - writing, coding, reasoning, translation, summarisation.
- Adaptability - Can be fine-tuned, prompted, or combined with retrieval systems to specialise for virtually any domain.
- Emergent abilities - As models scale, new capabilities appear that were not explicitly trained for - few-shot learning, chain-of-thought reasoning, code generation.
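To make the generality and adaptability points concrete, here is a minimal sketch of one general model being pointed at several tasks purely by changing the prompt. Everything named here is an assumption for illustration: `fake_model` is a hypothetical stand-in rather than a real provider API, and the task templates are made up. In practice you would swap `fake_model` for a call to whichever model you use.

```python
from typing import Callable

# Hypothetical task templates: the same model is "specialised"
# by changing only the prompt, not by retraining anything.
TASK_TEMPLATES = {
    "summarise": "Summarise the following text in one sentence:\n{text}",
    "translate": "Translate the following text into French:\n{text}",
    "classify": "Is the following message spam? Answer yes or no:\n{text}",
}


def build_prompt(task: str, text: str) -> str:
    """Fill the template for the requested task with the user's text."""
    if task not in TASK_TEMPLATES:
        raise ValueError(f"Unknown task: {task}")
    return TASK_TEMPLATES[task].format(text=text)


def fake_model(prompt: str) -> str:
    """Stand-in for a real foundation model call (assumption, not a real API)."""
    return f"[model response to {len(prompt)} characters of prompt]"


def run_task(task: str, text: str, model: Callable[[str], str] = fake_model) -> str:
    """Send the task-specific prompt to whichever model function is supplied."""
    return model(build_prompt(task, text))


if __name__ == "__main__":
    message = "Win a free phone now!"
    for task in TASK_TEMPLATES:
        print(task, "->", run_task(task, message))
```

The point of the design is that `run_task` never changes between tasks. Only the prompt does, which is what separates a foundation model from the one-model-per-task approach.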
Foundation model examples
| Model | Maker | Type | Open weights? |
|---|---|---|---|
| GPT-4o | OpenAI | Multimodal LLM | No |
| Claude Sonnet 4 | Anthropic | Multimodal LLM | No |
| Gemini 2.0 Flash | Google DeepMind | Multimodal LLM | No |
| DeepSeek R1 | DeepSeek | Reasoning LLM | Yes |
| Llama 3 | Meta | Text LLM | Yes |
| Mistral Large | Mistral AI | Text LLM | Partial |
| DALL-E 3 | OpenAI | Image generation | No |
| Stable Diffusion 3 | Stability AI | Image generation | Yes |
The foundation model ecosystem
Foundation models have stratified the AI industry into two tiers. A handful of well-capitalised labs, such as OpenAI, Anthropic, Google, Meta, and Mistral, train foundation models at enormous cost (GPT-4 reportedly cost $50-100M to train). The vast majority of AI products are then built on top of these foundations using APIs, fine-tuning, and RAG.
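As a rough illustration of how a product can be built on top of a foundation model with RAG, the sketch below retrieves the most relevant snippet from a tiny knowledge base and places it in the prompt. It makes several assumptions: the `DOCUMENTS` list is invented sample data, the function names are hypothetical, and retrieval uses simple keyword overlap in place of the embedding search that real RAG systems typically use. The resulting prompt would then be sent to a model of your choice.

```python
import re
from typing import List, Set

# Hypothetical in-memory knowledge base (illustrative data only).
DOCUMENTS = [
    "Refunds are available within 30 days of purchase.",
    "Support is open Monday to Friday, 9am to 5pm.",
    "Shipping to Europe takes five to seven business days.",
]


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, documents: List[str], k: int = 1) -> List[str]:
    """Rank documents by how many words they share with the question.

    Keyword overlap is a deliberate simplification; production systems
    usually compare embedding vectors instead.
    """
    question_words = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:k]


def build_rag_prompt(question: str, documents: List[str], k: int = 1) -> str:
    """Assemble a prompt that grounds the model in the retrieved context."""
    context = "\n".join(f"- {doc}" for doc in retrieve(question, documents, k))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    print(build_rag_prompt("How long does shipping to Europe take?", DOCUMENTS))
```

The foundation model itself is untouched here. The product's specialisation comes entirely from what is retrieved and placed in the prompt, which is why RAG is usually cheaper than fine-tuning.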
Open-weights models (Llama, Mistral, DeepSeek) are foundation models whose weights are publicly released, allowing anyone to run, fine-tune, or study them, subject to each model's licence. This creates a parallel open-weights ecosystem that puts pressure on the proprietary labs to stay competitive on cost.
See the ChatGPT vs Claude comparison and the Claude vs Gemini comparison for side-by-side analysis of leading foundation models. Related: LLM, transformer, fine-tuning.
Foundation model example
If you are using AskAI.free, a practical way to understand foundation models is to ask a model to explain the term, then ask for a concrete example in your own workflow. For example: "Explain what a foundation model is for someone using AI to write, code, research, or create images."
This turns the term from a dictionary definition into a decision-making tool: you can see when it affects prompt quality, model choice, output reliability, privacy, cost, or how much context the AI can use.
Why foundation models matter
Understanding foundation models matters because it changes how you choose, prompt, compare, or trust AI systems. If you understand the term, you can ask better questions, spot weak answers faster, and choose the right model or tool for the job.
A common mistake is treating "foundation model" as isolated jargon. It usually connects to nearby terms like GPT and Gemini, so check those next if you want the full picture.
Common mistake with foundation models
The most common mistake is using the term as a label without changing behavior. When the term comes up, ask what action should change: the prompt, the model, the input length, the evidence you request, or the way you verify the answer.