📚 Glossary

Vision model

In one line: An AI model that can understand images, not just text. Most modern flagship LLMs are now vision models.

A vision model is an LLM that can process images alongside text. Send it a screenshot, photo, diagram or chart and it can describe, analyse or answer questions about what's in it.

Common use cases:

'Extract the text from this receipt'
'What's wrong with this UI screenshot?'
'Describe this chart in plain English'
'Translate the text in this menu'
'Solve this handwritten math problem'

Vision-capable models on AskAI.free: ChatGPT 4o, Claude Sonnet 4, Gemini 2.5 Pro. Image upload is a Pro feature.

See it in action — ask any AI about vision model on AskAI.free.

Try it free →

Uh-oh!

Sign In

Create Account

Pick your plan

Vision model

Related terms