Glossary

Parameter

In one line: A single learnable number inside a neural network. LLMs have billions of parameters, and a bigger count usually (but not always) means a more capable model.

What is Parameter?

A parameter is a single learnable number inside a neural network. Each connection between neurons has a weight; each neuron has a bias. Every weight and bias is a parameter. During training, all parameters are nudged repeatedly, in small steps, to minimise prediction error. After training is complete, the parameters are frozen: they encode everything the model has learned about language, facts, and reasoning.
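To make the weight-and-bias accounting concrete, here is a minimal sketch that counts the parameters of a small fully connected network. The layer sizes and the helper name `count_dense_parameters` are illustrative assumptions, not taken from any real model.

```python
def count_dense_parameters(layer_sizes):
    """Count weights and biases in a fully connected network.

    layer_sizes lists the number of neurons per layer, input layer first.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        weights = n_in * n_out  # one weight per connection between the two layers
        biases = n_out          # one bias per neuron in the receiving layer
        total += weights + biases
    return total


# Hypothetical toy network: 4 inputs, 8 hidden neurons, 2 outputs.
toy_network = [4, 8, 2]
print(count_dense_parameters(toy_network))  # (4*8 + 8) + (8*2 + 2) = 58
```

Real LLMs use the same bookkeeping, only with far larger layers and additional components such as attention and embedding matrices.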

Parameter counts in modern LLMs

Parameter counts vary enormously across models. The table below shows where major models sit as of 2026. Note that proprietary labs rarely disclose exact figures.

Model	Parameters	Open weights	Notes
GPT-3	175B	No	The model that launched the LLM era (2020)
GPT-4o	~200B est.	No	Unofficial estimate; reportedly mixture-of-experts, but count and architecture are undisclosed
Claude Sonnet 4	Undisclosed	No	Anthropic does not publish counts
Gemini 2.0 Flash	Undisclosed	No	Optimised for speed and cost
DeepSeek R1	671B	Yes	MoE; only ~37B active per token
Llama 3 70B	70B	Yes	Meta's open-weights flagship
Mistral 7B	7B	Yes	Highly efficient for its size

More parameters = better?

Mostly yes, but the relationship is not linear. Three factors complicate the raw-count story:

Training quality matters as much as size. A 7B model trained on clean, carefully curated data can outperform a 70B model trained carelessly on noisy text.
Architecture design matters too. Transformer variants like Mixture of Experts (MoE) decouple stored parameters from active parameters. DeepSeek R1 has 671B parameters but activates only around 37B for any given token, so its per-token compute at inference time is comparable to that of a much smaller dense model, even though all 671B parameters must still be stored.
Benchmark coverage is uneven. A model can score brilliantly on coding benchmarks while being mediocre at creative writing. Parameter count tells you nothing about where a model's strengths lie.
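The gap between stored and active parameters is easy to check with a little arithmetic. The sketch below uses only the figures quoted on this page; the function name `active_fraction` is a hypothetical helper, and treating per-token compute as proportional to active parameters is a rough approximation, not an exact cost model.

```python
def active_fraction(total_params, active_params):
    """Share of stored parameters that are used for a single token."""
    return active_params / total_params


# Figures quoted on this page for DeepSeek R1 and Llama 3 70B.
deepseek_total = 671e9
deepseek_active = 37e9
llama_dense = 70e9  # dense model: every parameter is active for every token

share = active_fraction(deepseek_total, deepseek_active)
print(f"DeepSeek R1 active share per token: {share:.1%}")
print(f"Active parameters vs Llama 3 70B: {deepseek_active / llama_dense:.2f}x")
```

On these numbers, DeepSeek R1 touches roughly one parameter in eighteen per token, and fewer active parameters than the dense 70B model despite storing almost ten times as many.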

Labs have largely stopped advertising raw parameter counts as a marketing figure. Capability benchmarks, arena rankings, and real-world user tests are more meaningful than the number of weights in a model file.

Parameter efficiency techniques

Researchers have developed several ways to get more from fewer active parameters, making large models cheaper to run without sacrificing much quality:

Quantisation - Compressing parameters from 32-bit floats to 8-bit or 4-bit integers. This cuts the memory needed for the weights by 4x (8-bit) to 8x (4-bit) at a small accuracy cost, and often speeds up inference as well, making large models runnable on consumer GPUs.
Mixture of Experts (MoE) - Only a fraction of parameters activate for any given input. A router network selects the right expert sub-networks, so you store a large model but compute like a small one.
LoRA fine-tuning - Rather than updating all billions of parameters, LoRA adds small trainable adapter matrices and freezes the base model weights, giving task-specific adaptation at a tiny fraction of the cost of full fine-tuning.
Pruning - Identifying and removing parameters that contribute little to model outputs, reducing model size without retraining from scratch.
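Two of the techniques above reduce to simple arithmetic. The following is a rough sketch, not a benchmark: it estimates the memory needed for the weights alone at different precisions, and counts the trainable parameters LoRA adds to a single weight matrix. The function names, the 4096 x 4096 matrix size, and the rank of 8 are illustrative assumptions rather than values from any specific model.

```python
def weight_memory_gb(num_params, bits_per_param):
    """Approximate memory for the weights alone, in gigabytes (10**9 bytes).

    Ignores activations, caches, and quantisation metadata such as scales.
    """
    return num_params * bits_per_param / 8 / 1e9


def lora_trainable_params(d_in, d_out, rank):
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix.

    LoRA learns two low-rank factors, (d_out x rank) and (rank x d_in),
    while the original matrix stays frozen.
    """
    return rank * (d_in + d_out)


# Quantisation: a 7B-parameter model at different precisions.
for bits in (32, 16, 8, 4):
    print(f"{bits:>2}-bit: {weight_memory_gb(7e9, bits):.1f} GB")

# LoRA: adapters on a hypothetical 4096 x 4096 matrix with rank 8.
full = 4096 * 4096
adapter = lora_trainable_params(4096, 4096, rank=8)
print(f"LoRA trains {adapter:,} of {full:,} parameters ({adapter / full:.2%})")
```

Under these assumptions, a 7B model drops from 28 GB at 32-bit to 3.5 GB at 4-bit, and the LoRA adapter trains well under 1% of the parameters in the matrix it modifies.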

Parameter count is public for open-weights models like Llama and DeepSeek but deliberately undisclosed for most proprietary models. To compare models on what actually matters, see the ChatGPT vs Claude comparison or explore all available models on the pricing page.

Parameter example

If you are using AskAI.free, a practical way to understand parameters is to ask a model to explain the term, then ask for a concrete example from your own workflow. For example: "Explain parameters for someone using AI to write, code, research, or create images."

This turns the term from a dictionary definition into a decision-making tool: you can see when it affects prompt quality, model choice, output reliability, privacy, cost, or how much context the AI can use.

Why Parameter matters

Parameters matter because they change how you choose, prompt, compare or trust AI systems. If you understand this term, you can ask better questions, spot weak answers faster and choose the right model or tool for the job.

A common mistake is treating "parameter" as isolated jargon. It usually connects to nearby ideas like Prompt and Prompt engineering, so check those next if you want the full picture.

Common mistake with Parameter

The most common mistake is using the term as a label without changing behavior. When parameter count comes up, ask what action should change: the prompt, the model, the input length, the evidence you request, or the way you verify the answer.

