Transformer
In one line: The neural network architecture introduced in 2017 that powers every modern LLM — ChatGPT, Claude, Gemini, all of it.
The transformer is the neural network architecture introduced in the 2017 paper Attention Is All You Need by Google researchers. Every modern LLM — ChatGPT, Claude, Gemini, DeepSeek, Llama — is a transformer.
Key innovation: instead of processing text token by token in sequence (as RNNs do), transformers use attention to weigh all input tokens against each other simultaneously and figure out which ones matter most. Because that computation parallelizes, transformers are much faster to train, and attention captures long-range dependencies that sequential models struggle with.
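The core of that attention mechanism is a short computation: score every token against every other token, normalize the scores, and take a weighted mix. A minimal NumPy sketch of scaled dot-product attention (the variant from the 2017 paper; the function and variable names here are illustrative, not from any library):

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: subtract the max before exponentiating
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Q, K, V: (seq_len, d) arrays — every token attends to all tokens at once
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)        # pairwise relevance between tokens
    weights = softmax(scores, axis=-1)   # each row sums to 1
    return weights @ V                   # weighted mix of value vectors

# Toy example: 3 tokens with 4-dimensional embeddings
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 4))
K = rng.normal(size=(3, 4))
V = rng.normal(size=(3, 4))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (3, 4): one output vector per input token
```

Real transformers run many of these attention "heads" in parallel and stack the results through dozens of layers, but every head computes exactly this.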
The 'GPT' in ChatGPT stands for Generative Pre-trained Transformer. The transformer is the engine of the entire modern AI boom.
See it in action — ask any AI about transformers on AskAI.free.