Glossary

Foundation model

In one line: A large general-purpose AI model (like GPT-4 or Claude Sonnet) that's been trained on broad data and can be adapted for many tasks.

What is a foundation model?

A foundation model is a large AI model trained on broad, general-purpose data and designed to serve as a 'foundation' that can be adapted to many downstream tasks through prompting, fine-tuning, or retrieval-augmented generation (RAG).

The term was coined by Stanford researchers in 2021 to describe a fundamental shift in how AI development works. Instead of training one narrow model per task (a spam classifier, a translator, a summariser), you train one large model on vast general data, then adapt it cheaply to specific use cases. This became possible thanks to the transformer architecture and scaling laws showing that larger models on more data reliably become more capable.
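As a rough illustration of "one model, many tasks", the sketch below routes three different jobs through a single model function by changing only the prompt. The template wording, the `run_task` helper, and the `echo_model` stand-in are all hypothetical; a real setup would replace `echo_model` with a call to whichever model API you use.

```python
from typing import Callable

# Hypothetical task templates: the same model is steered only by the prompt.
TASK_TEMPLATES = {
    "spam": "Classify the following message as SPAM or NOT SPAM.\n\nMessage: {text}\nLabel:",
    "translate": "Translate the following text into French.\n\nText: {text}\nTranslation:",
    "summarise": "Summarise the following text in one sentence.\n\nText: {text}\nSummary:",
}


def build_prompt(task: str, text: str) -> str:
    """Fill the template for the requested task with the user's text."""
    if task not in TASK_TEMPLATES:
        raise ValueError(f"unknown task: {task}")
    return TASK_TEMPLATES[task].format(text=text)


def run_task(model: Callable[[str], str], task: str, text: str) -> str:
    """Send the task-specific prompt to one shared model function."""
    return model(build_prompt(task, text))


def echo_model(prompt: str) -> str:
    """Stand-in for a real model call (assumption): echoes the instruction line."""
    return f"[model would answer: {prompt.splitlines()[0]}]"


if __name__ == "__main__":
    for task in TASK_TEMPLATES:
        print(run_task(echo_model, task, "Win a free cruise today!"))
```

The point of the design is that nothing task-specific lives in the model itself: adding a fourth task means adding a template, not training a fourth model.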

What makes something a foundation model

  • Scale - Billions of parameters, trained on trillions of tokens of text, code, or multimodal data.
  • Generality - Capable of many tasks without task-specific training - writing, coding, reasoning, translation, summarisation.
  • Adaptability - Can be fine-tuned, prompted, or combined with retrieval systems to specialise for virtually any domain.
  • Emergent abilities - As models scale, new capabilities appear that were not explicitly trained for - few-shot learning, chain-of-thought reasoning, code generation.
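To give the "Scale" point a sense of proportion, here is a back-of-envelope sketch. It assumes 2 bytes per parameter (16-bit weights) and the widely used approximation that training compute is about 6 x parameters x tokens; the 70-billion-parameter, 2-trillion-token model is a hypothetical example, not a description of any model listed on this page.

```python
def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Memory needed just to store the weights, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


def training_flops(num_params: float, num_tokens: float) -> float:
    """Rule-of-thumb training compute: roughly 6 FLOPs per parameter per token."""
    return 6 * num_params * num_tokens


if __name__ == "__main__":
    params = 70e9   # hypothetical model size
    tokens = 2e12   # hypothetical training set size
    print(f"Weights at 16-bit precision: {weight_memory_gb(params):.0f} GB")
    print(f"Approximate training compute: {training_flops(params, tokens):.2e} FLOPs")
```

Under these assumptions the weights alone take about 140 GB, which is why models of this size are typically spread across several accelerators or quantised before they can be run.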

Foundation model examples

Model                Maker             Type               Open weights?
GPT-4o               OpenAI            Multimodal LLM     No
Claude Sonnet 4      Anthropic         Multimodal LLM     No
Gemini 2.0 Flash     Google DeepMind   Multimodal LLM     No
DeepSeek R1          DeepSeek          Reasoning LLM      Yes
Llama 3              Meta              Text LLM           Yes
Mistral Large        Mistral AI        Text LLM           Partial
DALL-E 3             OpenAI            Image generation   No
Stable Diffusion 3   Stability AI      Image generation   Yes

The foundation model ecosystem

Foundation models have stratified the AI industry into two tiers. In the first, a handful of well-capitalised labs - OpenAI, Anthropic, Google, Meta, and Mistral - train foundation models at enormous cost (GPT-4 reportedly cost $50-100M to train). In the second, the vast majority of AI products are built on top of these foundations using APIs, fine-tuning, and RAG.
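To make the RAG part concrete, the sketch below shows the basic shape: pick the most relevant snippets from your own documents, then place them in the prompt sent to the foundation model. The word-overlap scoring is a deliberately naive stand-in for the embedding search that real systems usually use, and the function names and sample documents are hypothetical.

```python
import re
from typing import List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, documents: List[str], k: int = 2) -> List[str]:
    """Return the k documents sharing the most words with the query (toy scoring)."""
    query_words = tokenize(query)
    ranked = sorted(documents, key=lambda d: len(query_words & tokenize(d)), reverse=True)
    return ranked[:k]


def build_rag_prompt(query: str, documents: List[str], k: int = 2) -> str:
    """Assemble a prompt that grounds the model in the retrieved snippets."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )


if __name__ == "__main__":
    docs = [
        "Refunds are processed within 14 days.",
        "Our office is in Lisbon.",
        "Shipping is free over 50 euros.",
    ]
    print(build_rag_prompt("How long do refunds take?", docs, k=1))
```

The resulting prompt would then be sent to a foundation model through its API; the model itself is unchanged, which is what makes this form of adaptation cheap.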

Open-weights models (Llama, Mistral, DeepSeek) are foundation models whose weights are publicly released, allowing anyone to run, fine-tune, or study them. This creates a parallel open-weights ecosystem that puts pressure on the proprietary labs to stay competitive on cost.

See the ChatGPT vs Claude comparison and the Claude vs Gemini comparison for side-by-side analysis of leading foundation models. Related: LLM, transformer, fine-tuning.

Foundation model example

If you are using AskAI.free, a practical way to understand foundation models is to ask a model to explain the term, then ask for a concrete example from your own workflow. For example: "Explain what a foundation model is for someone using AI to write, code, research, or create images."

This turns the term from a dictionary definition into a decision-making tool: you can see when it affects prompt quality, model choice, output reliability, privacy, cost, or how much context the AI can use.

Why foundation models matter

The idea of a foundation model matters because it changes how you choose, prompt, compare, and trust AI systems. If you understand the term, you can ask better questions, spot weak answers faster, and choose the right model or tool for the job.

A common mistake is treating "foundation model" as isolated jargon. It connects to nearby ideas such as GPT and Gemini, so check those next if you want the full picture.

Common mistake with foundation models

The most common mistake is using the term as a label without changing behavior. When a foundation model comes up, ask what should change as a result: the prompt, the model, the input length, the evidence you request, or the way you verify the answer.
