A context window is the maximum amount of text a language model can process in a single interaction, measured in tokens. Everything the model can 'see' at once (your prompt, the conversation history, any documents you've pasted in) must fit inside this window.

Early GPT models had context windows of around 4,000 tokens, roughly 3,000 words. Modern models like Claude and GPT-4 have windows of 128,000 to 200,000 tokens or more, enough to hold an entire novel.

When content exceeds the context window, it is typically truncated (usually from the oldest end) or the request fails entirely. The model has no memory of anything that fell outside the window, which is why very long conversations can cause models to 'forget' earlier messages.

Context window size directly determines what tasks a model can perform. A small window suffices for answering questions and writing short documents; a large window can hold an entire codebase for analysis, a lengthy report for summarization, or a coherent long-form conversation. Extending context windows efficiently remains one of the central engineering challenges in LLM development.
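The truncation behavior described above can be sketched in a few lines. This is an illustrative example only: the `estimate_tokens` heuristic (roughly 0.75 words per token is a common rule of thumb) and the `fit_to_window` helper are hypothetical stand-ins, since real systems count tokens with the model's actual tokenizer.

```python
def estimate_tokens(text: str) -> int:
    # Crude heuristic: ~0.75 words per token, so tokens ≈ words / 0.75.
    # Real models use their own tokenizer; this is only an approximation.
    return max(1, round(len(text.split()) / 0.75))

def fit_to_window(messages: list[str], window: int) -> list[str]:
    """Keep the most recent messages whose total token estimate fits
    the window; everything older is dropped (i.e. 'forgotten')."""
    kept: list[str] = []
    used = 0
    for msg in reversed(messages):   # walk newest-to-oldest
        cost = estimate_tokens(msg)
        if used + cost > window:
            break                    # this message and all older ones fall outside
        kept.append(msg)
        used += cost
    return list(reversed(kept))      # restore chronological order

# With a 30-token budget and three ~13-token messages,
# only the two most recent survive truncation.
history = ["msg " * 10, "older reply " * 5, "newest question " * 5]
trimmed = fit_to_window(history, 30)
```

Dropping from the oldest end mirrors how chat interfaces commonly behave: the latest turns are what the model most needs to respond coherently, so earlier turns are the first to be sacrificed.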