The context window is the maximum amount of text, measured in tokens, that a language model can consider in a single request. Larger windows allow for more complex inputs and longer documents.
Context windows have grown dramatically: early GPT models had 2K-token windows (~1,500 words), while modern models support 100K-2M+ tokens (75K-1.5M+ words). Larger context allows analyzing long documents, processing entire codebases, or maintaining long conversation histories without losing earlier context. As of 2026, Claude models support 200K+ token windows, GPT-4 Turbo offers 128K, and Gemini 1.5 Pro supports up to 1M+.
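The token-to-word figures above follow a rough rule of thumb of about 0.75 English words per token (2K tokens ≈ 1,500 words). A quick sketch of that arithmetic; the ratio is an approximation and varies by tokenizer and language:

```python
# Rough token/word arithmetic: ~0.75 words per token.
# Treat these numbers as order-of-magnitude estimates only.

def approx_words(tokens: int, words_per_token: float = 0.75) -> int:
    """Estimate how many English words fit in a given token budget."""
    return int(tokens * words_per_token)

for window in (2_000, 128_000, 200_000, 1_000_000):
    print(f"{window:>9,} tokens ≈ {approx_words(window):,} words")
```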
An analyst uploads a 50-page research report (~30,000 words / ~40,000 tokens) to Claude with a 200K context window. Claude processes the entire document and answers questions citing specific sections. The same task with a model limited to a 4K context window would require manually splitting the document into chunks.
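A minimal sketch of that fit-or-split check, assuming a crude 4-characters-per-token heuristic. Real applications should count tokens with the provider's tokenizer; the helper names here are illustrative:

```python
import math

CHARS_PER_TOKEN = 4  # rough heuristic, not an exact tokenizer count

def estimate_tokens(text: str) -> int:
    """Approximate token count from character length."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)

def chunks_needed(text: str, window: int, reserve_for_reply: int = 1_000) -> int:
    """How many pieces a document must be split into to fit a window,
    leaving some room for the prompt and response."""
    usable = window - reserve_for_reply
    return math.ceil(estimate_tokens(text) / usable)

report = "word " * 30_000                # stand-in for the ~30K-word report
print(chunks_needed(report, 200_000))    # fits in a single request -> 1
print(chunks_needed(report, 4_000))      # a 4K window forces many chunks
```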
Larger context window vs RAG: large context windows let you put everything in the prompt, while RAG retrieves only the relevant pieces. Both approaches work; large windows are simpler, and RAG scales to larger corpora.

Token cost implications: most APIs charge per token, so a larger context means a higher per-request cost. Free tiers usually have lower context limits.
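A quick sketch of that per-request cost arithmetic. The prices below are placeholders, not any provider's actual rates:

```python
def request_cost(input_tokens: int, output_tokens: int,
                 price_in_per_mtok: float, price_out_per_mtok: float) -> float:
    """Cost in dollars, given per-million-token prices for input and output."""
    return (input_tokens / 1_000_000) * price_in_per_mtok \
         + (output_tokens / 1_000_000) * price_out_per_mtok

# Example: the ~40K-token report plus a 1K-token answer, at hypothetical
# rates of $3 per 1M input tokens and $15 per 1M output tokens.
print(f"${request_cost(40_000, 1_000, 3.0, 15.0):.4f}")
```

Sending the full report with every question repeats that input cost on each request, which is one reason retrieval-based approaches can be cheaper for large corpora.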