Definition
Transformer
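A transformer is a neural-network architecture that uses attention mechanisms to relate parts of an input sequence to each other. Transformers made modern LLMs and many multimodal models practical for three reasons: attention can draw on context from anywhere in the sequence, training parallelizes across sequence positions instead of running one step at a time, and the same architecture adapts to many different sequence tasks.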
Last updated: 25 June 2026
Why it matters
Understanding transformers helps non-specialists see why context, tokens, and model size are central to modern generative AI.
Signals to watch
- The model uses attention layers
- Inputs are processed as sequences of tokens
- The surrounding token context shapes each output
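
To make the attention idea in the definition concrete, here is a minimal sketch of scaled dot-product self-attention in plain Python. It is an illustration under simplifying assumptions, not how any particular model is implemented: the function names `softmax` and `attention`, the toy 2-dimensional vectors, and the reuse of the same vectors as queries, keys, and values are all choices made for this example. Real transformers apply learned projections first and stack many such layers.

```python
import math

def softmax(scores):
    """Turn a list of raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d = len(keys[0])
    outputs = []
    for q in queries:
        # Score this position against every position in the sequence.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        # The output is a weighted average of the value vectors.
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
    return outputs

# Hypothetical toy embeddings for a three-token sequence (made-up numbers).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Self-attention: every token looks at every token, itself included.
mixed = attention(tokens, tokens, tokens)

# The same first token, seen with a shorter context, yields a different output.
short_context = attention([tokens[0]], tokens[:2], tokens[:2])
```

Each output vector is a blend of the whole sequence, which is why the first token's result changes when the third token is removed from its context. The loop over `queries` has no dependency between iterations, which hints at why this computation parallelizes well across positions.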